POLYA TYPE DISTRIBUTIONS IV. SOME PRINCIPLES OF SELECTING
A SINGLE PROCEDURE FROM A COMPLETE CLASS


By SAMUEL KARLIN 
Stanford University 


0. Introduction. In previous publications [1], [2], and [3], various aspects of 
decision theory in which the underlying distributions are Pólya type have been
studied. For example, complete classes of decision procedures were determined, 
all Bayes procedures were characterized, and the problem of admissibility was 
investigated as related to various kinds of loss functions. 

Usually the minimal complete class of decision procedures, to which the stat- 
istician would obviously restrict himself in practical application, is still quite 
large. Consequently, without any additional knowledge or further conditions, it 
is a hopeless task to justify preferring any given admissible procedure to another. 
It is therefore of importance to introduce new criteria which will single out a 
procedure for use. It is the object of this paper to discuss some further principles 
which select a single statistical procedure from the class of all "monotone"
procedures. 

In the n = 2 action problem (essentially the testing problem) some of the 
classical principles used to determine a single admissible procedure for use are 
related to the concepts of unbiasedness, maximum likelihood, invariance, 
minimax, etc. These principles have received much attention and their
justification and relevance are well understood for the parametric testing problem.
For a detailed analysis of these classical concepts in the case of two action
problems when the underlying distributions are Pólya type, the reader is referred
to [1]. Our present discussion deals with the extension and analysis of some of 
these principles to the n-action problem. In the sense that the estimation problem 
may be obtained as a limit of finite action problems, the ideas here shed further 
light on the estimation problem. 

The language and notation we use is that of the introduction of the previous
paper [3]. However, a knowledge of the results of [3] is not necessary for an
understanding of the present discussion, although a reading of the introduction
would more than provide sufficient familiarity with the terminology to be used
here as well as a general background for Pólya type distributions. Henceforth, we
assume that the notation of this manuscript is that of [3]. Nevertheless, for
clarity of exposition, we review briefly some of the main quantities to be used.

Let the distribution of the observed real random variable $X$ (usually a sufficient
statistic), depending on the unknown parameter $\omega$ ($\omega \in \Omega$, an interval of the real
line), have the form

\[
P(x, \omega) = \int_{-\infty}^{x} p(\xi, \omega)\, d\mu(\xi), \tag{1}
\]

where the density $p(\xi, \omega)$ possesses a monotone likelihood ratio (Pólya type 2)
and $\mu$ is a countably additive measure defined at least for the Borel field of sets
containing the open subsets of the real line. Occasionally, we shall assume the
stronger condition that the density is Pólya type 3.

The main transformation property of Pólya type 2 densities used in our analysis
is as follows: If $g(x)$ changes sign at most once (say from negative to positive
values), then

\[
h(\omega) = \int g(x)\, p(x, \omega)\, d\mu(x)
\]

changes sign at most once. Moreover, if $h(\omega)$ does indeed change sign, then it
must change in the same direction as $g$, i.e., from negative to positive. For a
thorough discussion of these properties the reader is referred to [2].

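There are $n$ possible actions, and $L_i(\omega)$ ($i = 1, \ldots, n$) represents the measure
of the loss when taking action $i$ and $\omega$ is the state of nature. We require that the
set $S_i$ of $\omega$ values on which $L_i(\omega)$ is smallest among the $L_j(\omega)$ be an interval,

\[
S_i = (\omega_{i-1}^0, \omega_i^0],
\]

where the $\omega_i^0$ satisfy

\[
-\infty = \omega_0^0 < \omega_1^0 < \omega_2^0 < \cdots < \omega_n^0 = +\infty.
\]

The set $S_i$ represents the set of $\omega$ values where action $i$ is favored if the state of
nature were known. Also, we assume that $L_i(\omega) - L_{i+1}(\omega)$ has exactly one sign
change, which must occur at $\omega_i^0$.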

We shall assume throughout what follows that the loss functions L,;(w) and 
the density p(x, w) satisfy sufficient smoothness conditions to guarantee the 
existence of all integrals involving these quantities and to justify all differentia- 
tion operations. In most particular examples these smoothness requirements 
can be readily verified. 

A statistical procedure is an $n$-tuple

\[
\varphi(x) = (\varphi_1(x), \ldots, \varphi_n(x)),
\]

where $\varphi_i(x)$ is interpreted as the probability of taking action $i$ when observing $x$.
A "monotone" procedure is characterized by a tuple

\[
(x_1, x_2, \ldots, x_{n-1};\ \lambda_1, \ldots, \lambda_{n-1}),
\]

where $x_1 \le x_2 \le \cdots \le x_{n-1}$, $0 \le \lambda_i \le 1$. Explicitly, when the $x_i$ are distinct,
then

\[
\varphi_i(x) =
\begin{cases}
1 & \text{if } x_{i-1} < x < x_i, \\
0 & \text{if } x < x_{i-1} \text{ or } x > x_i, \\
\lambda_{i-1} & \text{if } x = x_{i-1}, \\
1 - \lambda_i & \text{if } x = x_i,
\end{cases}
\]

and by definition $x_0 = -\infty$, $\lambda_0 = 0$, $x_n = +\infty$, $\lambda_n = 1$. In the case where some
of the $x_i$ coincide then appropriate changes in the form of the definition of $\varphi_i(x)$
at the values $x_i$ must be made. If the measure $\mu$ of (1) has no atoms (jumps),
then a monotone procedure is fully specified (up to equivalence almost everywhere
with respect to $\mu$) by the critical values $(x_1, x_2, \ldots, x_{n-1})$. For the sake
of simplicity of exposition, we restrict ourselves henceforth to the case of a
continuous distribution. However, we remark in passing that all of the results of
this paper may be extended, subject to suitable modifications, to the general
case where we allow the measure $\mu$ to possess atoms. The risk corresponding to
any given strategy $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_n)$ is given by the expression

\[
\rho(\omega, \varphi) = \int p(x, \omega) \Big\{ \sum_{i=1}^{n} L_i(\omega)\, \varphi_i(x) \Big\}\, d\mu(x). \tag{2}
\]
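
As a small numerical illustration of formula (2), the sketch below evaluates the risk of a non-randomized monotone procedure. It is only an example under stated assumptions: the observation is taken to be $X \sim N(\omega, 1)$ (a Pólya type family with a continuous $\mu$), and the helper names `normal_cdf`, `action_probabilities`, `risk`, and `zero_one_losses` are hypothetical, introduced here for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def action_probabilities(w, cuts):
    """P(action i | w), i = 1..n, for X ~ N(w, 1) (assumed) and critical values `cuts`."""
    edges = [-math.inf] + list(cuts) + [math.inf]
    cdf = [normal_cdf(e - w) for e in edges]
    return [cdf[i + 1] - cdf[i] for i in range(len(edges) - 1)]

def risk(w, cuts, losses):
    """Formula (2) for a non-randomized monotone procedure: sum_i L_i(w) * P(action i | w)."""
    return sum(L(w) * p for L, p in zip(losses, action_probabilities(w, cuts)))

def zero_one_losses(thresholds, c=1.0):
    """Hypothetical example loss: L_i(w) = 0 if w is in S_i = (w0_{i-1}, w0_i], else c."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    def make(i):
        return lambda w: 0.0 if edges[i] < w <= edges[i + 1] else c
    return [make(i) for i in range(len(edges) - 1)]

# Three actions with preference boundaries -1 and 1, critical values at the same points.
losses = zero_one_losses((-1.0, 1.0))
print(risk(0.0, (-1.0, 1.0), losses))
```

With these choices the risk at $\omega = 0$ is the probability that $X$ falls outside $(-1, 1)$, since only action 2 carries zero loss there.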

The collection of all monotone procedures constitutes a complete class [4].
When the loss functions satisfy additional assumptions, then all non-degenerate
monotone procedures are also admissible [3].

The set $\mathcal{M}$ of all monotone strategies forms an $n - 1$ dimensional family in the
sense that they depend on the $n - 1$ critical values which determine the procedures.
Our problem, in choosing a specific strategy from $\mathcal{M}$, is in essence finding
$n - 1$ conditions which will cut the class $\mathcal{M}$ down to a unique member.
Alternatively, we could impose some global restrictions which also single out a
monotone procedure. For instance, if an a priori distribution of nature $F(\omega)$ is known
to be meaningful, then the Bayes procedure with respect to $F$ determines a
specific monotone procedure. [See [3], [5].] The assumption of the existence of $F$ is
often hard to justify and appears contrived.

Another global condition frequently followed is to choose a monotone minimax
procedure. However, minimax procedures are often very unreasonable on
the basis of statistical intuition, and there exists a feeling that the minimax
philosophy is in general too conservative and unrealistic. Of course, modifications
of the minimax principle lead to the so-called regret principles. Various
complications appear also for the case of the criterion of minimax regret [6].

A third method for choosing a monotone procedure is inherent in the construction
of complete classes as introduced in [4]. Suppose that for a given problem
there has been in use a common or accepted mode of action which is not a
monotone procedure. Then, there exists at least one monotone procedure which
improves everywhere on it for the decision problem of more than two actions. If
the original procedure is described by an $n$-tuple of functions $\varphi = (\varphi_1, \ldots, \varphi_n)$,
then any monotone procedure $\varphi^0 = (\varphi_1^0, \ldots, \varphi_n^0)$ (and there is at least one)
which satisfies

\[
\int p(x, \omega) \Big[ \sum_{j=1}^{k} \varphi_j^0(x) - \sum_{j=1}^{k} \varphi_j(x) \Big]\, d\mu(x)
\begin{cases}
\ge 0 & \text{for } \omega \le \omega_k^0, \\
\le 0 & \text{for } \omega \ge \omega_k^0,
\end{cases}
\qquad k = 1, \ldots, n - 1,
\]

improves on $\varphi$.

This method is constructive. That is, for any non-monotone procedure in use
we can explicitly exhibit a monotone procedure which yields a smaller risk
uniformly for any choice of the state of nature $\omega$. The apparent disadvantage to this
idea is that it involves only an improvement relative to a given non-monotone
procedure and sheds no light on the intrinsic question of selecting a specific
monotone procedure from the class $\mathcal{M}$.

In this study we will analyze three principles of selecting a single monotone
procedure from $\mathcal{M}$. The first represents an extension of the maximum likelihood
estimate to the circumstance of the $n$-action problem. The monotone test
obtained in this case has a lot of intuitive appeal and will be referred to as the
maximum likelihood procedure.

The following section examines another approach called the principle of
maximum probabilities (abbreviated M.P.). This principle, as well as the maximum
likelihood procedure, does not depend on the specific values of the loss functions
but rather on the preference regions $S_i = (\omega_{i-1}^0, \omega_i^0]$. Any other loss function
satisfying the properties of a monotone preference pattern and giving rise to the same
preference sets $S_i$ will possess the same class of monotone procedures obeying the
principle of M.P.

The precise description of this principle is as follows: A decision procedure
which is defined by an $n$-tuple of functions $\varphi = (\varphi_1, \ldots, \varphi_n)$ is said to have the
property of maximum probabilities ($\varphi$ has M.P.) if for every $i$

\[
h_i(\omega') \ge h_i(\omega'') \quad \text{for any } \omega' \text{ in } S_i,\ \omega'' \notin S_i, \tag{3}
\]

where

\[
h_i(\omega) = \int \varphi_i(x)\, p(x, \omega)\, d\mu(x). \tag{4}
\]

For the case of two actions a procedure $\varphi$ has the property of M.P. if and only if
$\varphi$ is unbiased in the classical sense. Therefore, this principle may be considered
to be a generalization to the case of $n$ actions of the concept of unbiasedness.
The quantity $h_i(\omega)$ may be interpreted as the unconditional probability for the
procedure $\varphi$ of taking action $i$ when the state of nature is $\omega$. The condition (3)
states that $h_i(\omega)$ is larger when $\omega$ is in $S_i$ than when $\omega$ is outside $S_i$. This last
property is the reason for the name, principle of maximum probabilities.

It will be shown that there always exist monotone procedures having the
property of M.P. for the case of $n \le 5$ actions. In fact, we shall exhibit a one
parameter family of such procedures. When $n > 5$, such monotone procedures
in general cease to exist.

The final principle investigated is the principle of unbiasedness (in the sense
of Lehmann [7]). A decision procedure $\varphi$ is said to be risk unbiased with respect
to the loss functions $L_i$ if $E_\theta[L(\omega, \varphi(x))] \ge E_\theta[L(\theta, \varphi(x))]$ for all $\omega$ and $\theta$, where
$E_\theta(\cdot)$ denotes the expected value given that the state of nature is $\theta$, and

\[
L(\omega, \varphi(x)) = \sum_{i} L_i(\omega)\, \varphi_i(x).
\]

For the case of two actions, this definition reduces to the usual concept of
unbiasedness. This principle of unbiasedness differs from the principle of M.P. in
that the former depends in a very crucial way on the magnitudes of the loss
functions while the latter depends only on the preference regions. We shall prove
that if $L_j(\omega) = L_{ij}$ for $\omega$ in $S_i$ and the $L_{ij}$ satisfy suitable assumptions, then there
exists a unique admissible monotone procedure unbiased in the sense of
Lehmann. The method of proof of the existence will in effect be constructive. In
general, risk unbiased procedures need not exist.

ACKNOWLEDGEMENT. I wish to thank Mr. R. Miller for his help in the prep- 
aration of this manuscript. 


1. Maximum likelihood principle. We assume throughout this section that the
density $p(x, \omega)$ of (1) has a strict monotone likelihood ratio and further that
$p(x, \omega)$ possesses continuous second order partial derivatives. The fact that
$p$ is of Pólya type 2 implies (see [2]) that

\[
\begin{vmatrix}
p(x, \omega) & \dfrac{\partial}{\partial \omega}\, p(x, \omega) \\[2ex]
\dfrac{\partial}{\partial x}\, p(x, \omega) & \dfrac{\partial^2}{\partial x\, \partial \omega}\, p(x, \omega)
\end{vmatrix} \ge 0 \tag{5}
\]

for all $x$ and $\omega$. An additional assumption is imposed to the effect that the
inequality of (5) is strict for all $x$ and $\omega$. Finally, we assume that for each $x$ in $X$
the equation

\[
\frac{\partial}{\partial \omega}\, p(x, \omega) = 0 \tag{6}
\]

has a unique solution, $\omega = \omega(x)$, which is a differentiable function of $x$. These
assumptions are not as stringent as may appear offhand. A wide class of distributions,
including the exponential family ($p(x, \omega) = e^{x\omega} \beta(\omega)$), the noncentral $t$,
the noncentral $\chi^2$, etc., fulfills these requirements. For the exponential family,
$\omega(x)$ is the solution of the equation $-\beta'(\omega)/\beta(\omega) = x$.

LEMMA 1. $\omega(x)$ is a strictly increasing function of $x$.

Proof. Differentiating Eq. (6) with respect to $x$ leads to

\[
\frac{\partial^2 p(x, \omega(x))}{\partial x\, \partial \omega} + \frac{\partial^2 p(x, \omega(x))}{\partial \omega^2}\, \omega'(x) = 0. \tag{7}
\]

By assumption,

\[
\begin{vmatrix}
p(x, \omega(x)) & \dfrac{\partial}{\partial \omega}\, p(x, \omega(x)) \\[2ex]
\dfrac{\partial}{\partial x}\, p(x, \omega(x)) & \dfrac{\partial^2}{\partial x\, \partial \omega}\, p(x, \omega(x))
\end{vmatrix} > 0,
\]

which implies $\partial^2 p(x, \omega(x))/\partial x\, \partial \omega > 0$ because of (6). Since $p(x, \omega)$ assumes a
maximum at $\omega = \omega(x)$, $\partial^2 p(x, \omega(x))/\partial \omega^2 < 0$. Thus from (7), $\omega'(x) > 0$.

As $x$ varies over the sample space $X$, $\omega(x)$ varies over the whole $\Omega$ interval.
Suppose not; then there exists an $\omega_0$ such that $\omega_0$ is not the upper endpoint of
$\Omega$ and for $\omega > \omega_0$, $\partial p(x, \omega)/\partial \omega < 0$ for all $x$. (Or similarly for the lower end
of $\Omega$.) But this contradicts the fact that for all $\omega \in \Omega$,

\[
\int \frac{\partial}{\partial \omega}\, p(x, \omega)\, d\mu(x) = \frac{\partial}{\partial \omega} \int p(x, \omega)\, d\mu(x) = 0.
\]

Since $\omega(x)$ is a 1-1 strictly monotonic mapping of $X$ onto $\Omega$, the inverse
function $\omega^{-1}$ is well-defined. Set $x_i^0 = \omega^{-1}(\omega_i^0)$, $i = 1, \ldots, n - 1$. The maximum
likelihood principle dictates that the monotone procedure which should be used
is the one defined by the critical numbers $(x_1^0, \ldots, x_{n-1}^0)$. For $x \in (x_{i-1}^0, x_i^0)$, take
action $i$, $i = 1, \ldots, n$, where $x_0^0 = -\infty$ and $x_n^0 = +\infty$. This procedure has the feature
that for any observed $x$ the proper action $i$ is taken whose corresponding interval
$(\omega_{i-1}^0, \omega_i^0)$ includes the maximum likelihood estimate of $\omega$. In less precise
language, that action is taken which is most likely.
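
The rule just described can be sketched in a few lines. This is a minimal illustration, not part of the original development: the function name `ml_action` is hypothetical, the default estimate $\omega(x) = x$ corresponds to the assumed case $X \sim N(\omega, 1)$, and the second example assumes a gamma density with known shape 3 and scale $\omega$, for which $\omega(x) = x/3$.

```python
from bisect import bisect_left

def ml_action(x, thresholds, mle=lambda x: x):
    """Return the action i (1-based) whose interval (w0_{i-1}, w0_i] contains mle(x).

    `thresholds` holds the increasing boundaries w0_1 < ... < w0_{n-1};
    `mle` maps the observation x to the maximum likelihood estimate w(x).
    """
    return bisect_left(list(thresholds), mle(x)) + 1

# Four actions with boundaries -1, 0, 1 and w(x) = x (normal mean, assumed).
print([ml_action(x, (-1.0, 0.0, 1.0)) for x in (-5.0, 0.0, 0.3, 2.0)])

# Gamma scale family with known shape 3 (assumed): w(x) = x / 3.
print(ml_action(4.5, (1.0, 2.0), mle=lambda x: x / 3.0))
```

Because $\omega(x)$ is increasing, comparing $\omega(x)$ with the $\omega_i^0$ is equivalent to comparing $x$ with the critical numbers $x_i^0 = \omega^{-1}(\omega_i^0)$.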


2. Principle of maximum probabilities (M.P.). The principle of maximum
probabilities is one type of extension of the concept of unbiasedness in hypothesis
testing. Consider the $n$-action problem defined by the points
$-\infty = \omega_0^0 < \omega_1^0 < \cdots < \omega_n^0 = +\infty$, in which action $i$ is preferred in the interval
$S_i = (\omega_{i-1}^0, \omega_i^0]$. A decision procedure which is defined by an $n$-tuple of functions
$\varphi = (\varphi_1, \ldots, \varphi_n)$ is said to have the property M.P. if for every $i$,
$h_i(\omega') \ge h_i(\omega'')$ for any $\omega' \in S_i$, $\omega'' \notin S_i$, where

\[
h_i(\omega) = \int \varphi_i(x)\, p(x, \omega)\, d\mu(x).
\]

Our object is to try to establish the existence of monotone procedures possessing
the property of M.P.

It is necessary in studying this concept to assume that the density $p(x, \omega)$ is
strictly Pólya type 3, and that the equation $\partial p(x, \omega)/\partial \omega = 0$ is well-defined
and has a unique solution $\omega = \omega(x)$ for each value of $x$. For any constants $a < b$
it is tacitly assumed that differentiation with respect to $\omega$ is valid inside the
integral sign of

\[
\int_a^b p(x, \omega)\, d\mu(x).
\]

Also, assume that $\mu$ is a continuous measure without discrete mass points whose
spectrum is an interval. This last assumption is not essential but without it
additional care must be taken in handling randomizations and the lack of
uniqueness of various quantities caused by gaps in the spectrum.

For the purpose of exposition our analysis is divided into a series of lemmas. 

A monotone strategy is now defined by $n - 1$ points $(x_1, \ldots, x_{n-1})$. Let
$i$ ($i = 1, \ldots, n - 2$) be fixed for the moment and define $(x_i(\alpha), x_{i+1}(\alpha))$ by
the equations

\[
\begin{aligned}
h_{i+1}(\omega_i^0) &= \int_{x_i}^{x_{i+1}} p(x, \omega_i^0)\, d\mu(x) = \alpha, \\
h_{i+1}(\omega_{i+1}^0) &= \int_{x_i}^{x_{i+1}} p(x, \omega_{i+1}^0)\, d\mu(x) = \alpha.
\end{aligned} \tag{8}
\]

$(x_i(\alpha), x_{i+1}(\alpha))$ are uniquely defined since by Theorem 3 of [1] there is a unique
monotone strategy which improves on the non-monotone strategy $\varphi(x) \equiv \alpha$.
Moreover, it is clear that $h_{i+1}(\omega') \ge h_{i+1}(\omega'')$ for any $\omega' \in S_{i+1}$ and $\omega'' \notin S_{i+1}$ when
(8) is satisfied.
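
To make the pair of equations (8) concrete, the following sketch solves them numerically in one assumed special case, $X \sim N(\omega, 1)$. For this symmetric translation family the interval $(x_i, x_{i+1})$ is centered at the midpoint of $\omega_i^0$ and $\omega_{i+1}^0$, so only its half-width has to be found; that shortcut, the bisection method, and the names `normal_cdf` and `solve_pair` are assumptions of the illustration, not part of the paper's argument.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def solve_pair(alpha, w_lo, w_hi, tol=1e-12):
    """Return (x_i, x_{i+1}) carrying mass alpha under both N(w_lo, 1) and N(w_hi, 1).

    The interval is mid - d .. mid + d; by symmetry both equations in (8)
    reduce to Phi(d + c) - Phi(c - d) = alpha, with c half the gap w_hi - w_lo.
    """
    mid = 0.5 * (w_lo + w_hi)
    c = 0.5 * (w_hi - w_lo)
    mass = lambda d: normal_cdf(d + c) - normal_cdf(c - d)  # increasing in d
    lo, hi = 0.0, c + 40.0
    while hi - lo > tol:  # plain bisection on the half-width d
        d = 0.5 * (lo + hi)
        if mass(d) < alpha:
            lo = d
        else:
            hi = d
    d = 0.5 * (lo + hi)
    return mid - d, mid + d

for alpha in (0.1, 0.3, 0.5):
    print(alpha, solve_pair(alpha, 0.0, 1.0))
```

In this example the left endpoint moves down and the right endpoint moves up as $\alpha$ grows, which is the behavior asserted in general by Lemma 2.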


Lemma 2. $x_i(\alpha)$ is a monotone decreasing and $x_{i+1}(\alpha)$ is a monotone increasing
function of $\alpha$.

Proof. From (8),

\[
\int_{x_i(\alpha)}^{x_{i+1}(\alpha)} \{ p(x, \omega_i^0) - p(x, \omega_{i+1}^0) \}\, d\mu(x) = 0 \tag{9}
\]

for all $\alpha$. Since $p(x, \omega)$ is strictly Pólya type 3, $p(x, \omega_i^0) - p(x, \omega_{i+1}^0)$ has at most
one zero; by (9) it has at least one. In order that the relation (9) be preserved
for all $\alpha$, either $x_i(\alpha)$ increases and $x_{i+1}(\alpha)$ decreases, or $x_i(\alpha)$ decreases and
$x_{i+1}(\alpha)$ increases, as $\alpha$ increases. It is clear from (8) that the latter must hold.

It also follows from the variation diminishing properties of the density $p(x, \omega)$
[2] that

\[
h_1(\omega) = \int_{-\infty}^{x_1} p(x, \omega)\, d\mu(x)
\]

is a monotone decreasing function of $\omega$, and

\[
h_n(\omega) = \int_{x_{n-1}}^{\infty} p(x, \omega)\, d\mu(x)
\]

is a monotone increasing function of $\omega$ for any $x_1$ and $x_{n-1}$ respectively.

Consider $x_i(\alpha)$ and $x_{i+1}(\alpha)$, which are defined by the equations
$h_{i+1}(\omega_i^0) = \alpha = h_{i+1}(\omega_{i+1}^0)$, and $x'_{i+1}(\alpha)$ and $x'_{i+2}(\alpha)$, which are defined by
$h_{i+2}(\omega_{i+1}^0) = \alpha = h_{i+2}(\omega_{i+2}^0)$. Then

Lemma 3. For all $\alpha$, $x_i(\alpha) < x'_{i+1}(\alpha)$ and $x_{i+1}(\alpha) < x'_{i+2}(\alpha)$, $i = 1, \ldots, n - 3$.

Proof. Let

\[
I_{a,b}(x) =
\begin{cases}
1, & a \le x \le b, \\
0, & \text{otherwise.}
\end{cases}
\]

Suppose $x_i(\alpha) \ge x'_{i+1}(\alpha)$. Then $I_{x_i, x_{i+1}} - I_{x'_{i+1}, x'_{i+2}}$ is always of one sign
or at worst changes sign from $-$ to $+$. But

\[
\int \big( I_{x_i, x_{i+1}}(x) - I_{x'_{i+1}, x'_{i+2}}(x) \big)\, p(x, \omega)\, d\mu(x)
\begin{cases}
> 0 & \text{for } \omega < \omega_{i+1}^0, \\
= 0 & \text{for } \omega = \omega_{i+1}^0, \\
< 0 & \text{for } \omega > \omega_{i+1}^0,
\end{cases} \tag{10}
\]

which is an impossibility in that it changes sign in the wrong direction [2], so
$x_i(\alpha) < x'_{i+1}(\alpha)$.

Suppose $x_{i+1}(\alpha) \ge x'_{i+2}(\alpha)$. Then $I_{x_i, x_{i+1}} - I_{x'_{i+1}, x'_{i+2}}$ is always of one sign,
which contradicts (10).

As $\alpha \to 1$, $x_i(\alpha), x'_{i+1}(\alpha) \to -\infty$ and $x_{i+1}(\alpha), x'_{i+2}(\alpha) \to +\infty$ (or the ends of
the spectrum of $\mu$), and as $\alpha \to 0$, $x_i(\alpha) \to x_i^*$, $x_{i+1}(\alpha) \to x_i^*$, and $x'_{i+1}(\alpha) \to x_{i+1}^*$,
$x'_{i+2}(\alpha) \to x_{i+1}^*$. Lemma 5 below asserts that $x_i^* < x_{i+1}^*$, but first it is
necessary to prove Lemma 4.

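Lemma 4. $\partial p(x_i^*, \omega)/\partial \omega$ does not vanish at $\omega_i^0$ or $\omega_{i+1}^0$, but does vanish for some
$\omega_i^*$ where $\omega_i^0 < \omega_i^* < \omega_{i+1}^0$; $i = 1, \ldots, n - 2$.

Proof. Since the integral in (8) takes the same value $\alpha$ at $\omega_i^0$ and at $\omega_{i+1}^0$, the
mean value theorem gives, for some $\omega_i^*(\alpha) \in [\omega_i^0, \omega_{i+1}^0]$,

\[
\frac{\partial}{\partial \omega} \int_{x_i(\alpha)}^{x_{i+1}(\alpha)} p(x, \omega)\, d\mu(x) \bigg|_{\omega = \omega_i^*(\alpha)}
= \int_{x_i(\alpha)}^{x_{i+1}(\alpha)} \frac{\partial p(x, \omega_i^*(\alpha))}{\partial \omega}\, d\mu(x) = 0
\]

for every $\alpha \in [0, 1]$. As $\alpha \to 0$, $\omega_i^*(\alpha) \to \omega_i^*$ and $\partial p(x_i^*, \omega_i^*)/\partial \omega = 0$. Suppose
$\omega_i^* = \omega_i^0$. Since $p(x_i^*, \omega)$ has its maximum at its unique stationary point, it follows that
$\partial p(x_i^*, \omega)/\partial \omega < 0$ for $\omega > \omega_i^0$, which implies that
$p(x_i^*, \omega_i^0) > p(x_i^*, \omega_{i+1}^0)$. Since $p(x, \omega)$ is continuous in each variable, there exists $\epsilon > 0$ such
that $p(x, \omega_i^0) > p(x, \omega_{i+1}^0)$ for all $x$ satisfying $|x - x_i^*| < \epsilon$. But this implies
that for sufficiently small $\alpha$,

\[
\int_{x_i(\alpha)}^{x_{i+1}(\alpha)} p(x, \omega_i^0)\, d\mu(x) > \int_{x_i(\alpha)}^{x_{i+1}(\alpha)} p(x, \omega_{i+1}^0)\, d\mu(x),
\]

a contradiction of the definition of $x_i(\alpha)$ and $x_{i+1}(\alpha)$. Similarly, $\omega_i^* \ne \omega_{i+1}^0$. Thus,
$\omega_i^* \in (\omega_i^0, \omega_{i+1}^0)$.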


Lemma 5. $x_i^* < x_{i+1}^*$, $i = 1, \ldots, n - 2$.

Proof. By Lemma 3, $x_i^* \le x_{i+1}^*$. Suppose $x_i^* = x_{i+1}^*$. Then
$\partial p(x_i^*, \omega_i^*)/\partial \omega = \partial p(x_i^*, \omega_{i+1}^*)/\partial \omega = 0$, where $\omega_i^* \in (\omega_i^0, \omega_{i+1}^0)$ and
$\omega_{i+1}^* \in (\omega_{i+1}^0, \omega_{i+2}^0)$, which is impossible by assumption.

This lemma can now be utilized to construct decision procedures possessing
the property of M.P. For the 2-action problem any monotone procedure (defined
by a single number $x_1$) is unbiased. In the 3-action problem each monotone
procedure $(x_1, x_2)$ which satisfies $h_2(\omega_1^0) = \alpha = h_2(\omega_2^0)$ for some $\alpha \in [0, 1]$ is unbiased.
This means the monotone M.P. procedures are a one parameter family since
once $x_1$ is specified as possible, $x_2$ and $\alpha$ are determined. For $n = 4$ consider
$x_1(\alpha_1), x_2(\alpha_1)$ defined by $h_2(\omega_1^0) = \alpha_1 = h_2(\omega_2^0)$ and $x'_2(\alpha_2), x'_3(\alpha_2)$ defined by
$h_3(\omega_2^0) = \alpha_2 = h_3(\omega_3^0)$, where $\alpha_1$ and $\alpha_2$ are chosen small enough to insure that
$x_2(\alpha_1) < x'_2(\alpha_2)$. By Lemma 5 this is possible. Increase $\alpha_1$ and $\alpha_2$ until
$x_2(\alpha_1) = x'_2(\alpha_2)$. The monotone procedure defined by $(x_1(\alpha_1), x_2(\alpha_1), x'_3(\alpha_2))$ has the property
of M.P. Again the monotone M.P. procedures form a one parameter family
since any point $y \in (x_1^*, x_2^*)$ will determine $\alpha_1$ and $\alpha_2$ by the condition that
$x_2(\alpha_1) = y = x'_2(\alpha_2)$.

For the case of 5 actions the same method of construction is employed and a
one parameter family of monotone M.P. decision procedures is designated.
Define

\[
\begin{aligned}
x_1(\alpha_1),\ x_2(\alpha_1) \quad &\text{by} \quad h_2(\omega_1^0) = \alpha_1 = h_2(\omega_2^0), \\
x'_2(\alpha_2),\ x'_3(\alpha_2) \quad &\text{by} \quad h_3(\omega_2^0) = \alpha_2 = h_3(\omega_3^0), \\
x''_3(\alpha_3),\ x''_4(\alpha_3) \quad &\text{by} \quad h_4(\omega_3^0) = \alpha_3 = h_4(\omega_4^0),
\end{aligned}
\]

where $\alpha_1, \alpha_2, \alpha_3$ are chosen so small that $x_2(\alpha_1) < x'_2(\alpha_2)$ and $x'_3(\alpha_2) < x''_3(\alpha_3)$.
Increase $\alpha_1$ and $\alpha_3$ until $x_2(\alpha_1) = x'_2(\alpha_2)$ and $x'_3(\alpha_2) = x''_3(\alpha_3)$. The monotone
procedure $(x_1(\alpha_1), x_2(\alpha_1), x'_3(\alpha_2), x''_4(\alpha_3))$ has the property of maximum probabilities.
The family has only one parameter since the point $y \in (x_1^*, x_2^*)$ determines
$\alpha_1$, $\alpha_2$ and $\alpha_3$ through the relation $x_2(\alpha_1) = y = x'_2(\alpha_2)$. (Note that some values
of $y$ in the interval may not be legitimate parameter points. This will happen
when the condition $y = x'_2(\alpha_2)$ is satisfied by an $\alpha_2$ for which $x'_3(\alpha_2) > x_3^*$.)

When $n = 6$, the reader may verify that this method of construction breaks
down. The difficulty is that $x_i(\alpha)$ does not have to decrease at the same rate at
which $x_{i+1}(\alpha)$ increases. It may not be possible to choose $\alpha_2$ and $\alpha_3$ such that
$x'_3(\alpha_2) = x''_3(\alpha_3)$ and still have $x_1^* < x'_2(\alpha_2)$ and $x''_4(\alpha_3) < x_4^*$.

For the cases n = 3, 4, and 5, note what has been accomplished by introducing 
the principle of M.P. The statistician, instead of having to choose a procedure 
from the class of all monotone procedures which is defined by n — 1 parameters, 
has only to choose from a class of procedures defined by only one parameter, 
those monotone procedures which have the additional property of maximum 
probabilities. 

If the unknown parameter occurs in the density in the form of a translation
parameter, that is $p(x, \omega) = p(x - \omega)$, $d\mu(x) = dx$, and $p(\cdot)$ is a symmetric
function with respect to the origin, then any monotone procedure $\varphi$ defined by
the critical numbers $x_1 < x_2 < \cdots < x_{n-1}$ such that

\[
\frac{x_i + x_{i+1}}{2} = \frac{\omega_i^0 + \omega_{i+1}^0}{2} \qquad \text{for } i = 1, 2, \ldots, n - 2
\]

satisfies the property of M.P. The proof of this statement is straightforward and
is omitted.
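
The translation-parameter statement can be checked numerically in an assumed example. The sketch below takes $p$ to be the standard normal density, builds the critical numbers from a chosen $x_1$ through the midpoint relation $x_{i+1} = \omega_i^0 + \omega_{i+1}^0 - x_i$, and tests condition (3) on a finite grid of $\omega$ values. The helper names `mp_cuts` and `has_mp_on_grid` are hypothetical, and a grid check is only evidence, not a proof.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mp_cuts(thresholds, x1):
    """Critical numbers with (x_i + x_{i+1}) / 2 = (w0_i + w0_{i+1}) / 2, starting from x1."""
    cuts = [x1]
    for a, b in zip(thresholds, thresholds[1:]):
        cuts.append(a + b - cuts[-1])
    return cuts

def has_mp_on_grid(thresholds, cuts, grid, tol=1e-12):
    """Check h_i(w') >= h_i(w'') for grid points w' in S_i and w'' outside S_i."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    xs = [-math.inf] + list(cuts) + [math.inf]
    for i in range(len(edges) - 1):
        h = lambda w: normal_cdf(xs[i + 1] - w) - normal_cdf(xs[i] - w)
        inside = [h(w) for w in grid if edges[i] < w <= edges[i + 1]]
        outside = [h(w) for w in grid if not (edges[i] < w <= edges[i + 1])]
        if min(inside) < max(outside) - tol:
            return False
    return True

thresholds = (-1.0, 0.0, 1.0)              # four actions
grid = [k / 100.0 for k in range(-600, 601)]
cuts = mp_cuts(thresholds, -1.2)           # one free parameter: x1
print(cuts, has_mp_on_grid(thresholds, cuts, grid))
print(has_mp_on_grid(thresholds, [-1.2, 0.5, 0.8], grid))  # midpoint relation broken
```

The first argument of `mp_cuts` fixes the preference boundaries, so the whole family is indexed by the single number `x1`, in line with the one parameter families described above; the chosen `x1` must still yield increasing critical numbers.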


3. Unbiasedness in the sense of Lehmann. A decision procedure $\varphi(x)$ is said
to be unbiased (in the sense of Lehmann or risk unbiased) if

\[
E_\theta[L(\omega, \varphi(x))] \ge E_\theta[L(\theta, \varphi(x))] \tag{11}
\]

for all $\omega$ and $\theta$, where $E_\theta(\cdot)$ represents the expected value given that the state
of nature is $\theta$. By specializing the loss function $L(\omega, a)$, it can be readily verified
that this general definition of unbiasedness reduces to some of the classical
notions. For a full discussion of the significance of this concept, the reader is
referred to [7].

We search in this analysis to discover when unbiased procedures exist within 
the class of monotone procedures for the case of multiaction problems. An effec- 
tive method of explicit construction of such procedures would also be desirable. 
Unfortunately, in general unbiased procedures need not exist. However, Theorem
1 below provides an affirmative answer for a substantial class of loss functions 
satisfying assumptions (a) and (b). 

It should be emphasized that in contrast to the principle of M.P., which also
embodies a generalization of the notion of unbiasedness in testing hypotheses,
the present extension involves the specific loss functions in a fundamental way.
The assumptions are as follows.

(a) $L_j(\omega) = L_{ij}$ for all $\omega$ in $S_i = (\omega_{i-1}^0, \omega_i^0]$.

Let $L_{ij} - L_{i+1,j} = a_{ij}$.

(b) $0 \ge a_{i1} \ge a_{i2} \ge \cdots \ge a_{ii}$ and $a_{ii} < 0$;
$a_{i,i+1} \ge a_{i,i+2} \ge \cdots \ge a_{in} \ge 0$ and $a_{i,i+1} > 0$ for $i = 1, 2, \ldots, n - 1$.

Let

\[
b_{ij} =
\begin{cases}
-a_{ij}, & j = 1, \ldots, i, \\
\phantom{-}a_{ij}, & j = i + 1, \ldots, n,
\end{cases}
\qquad i = 1, \ldots, n - 1.
\]
(c) For $j \le i$, $k \ge i + 2$, $i = 1, \ldots, n - 1$,

\[
\begin{vmatrix}
b_{ij} & b_{ik} \\
b_{i+1,j} & b_{i+1,k}
\end{vmatrix} \ge 0.
\]


Two important examples of decision problems whose loss functions satisfy
conditions (b) and (c) are worth noting.

(I) $L_j(\omega) = c\,|i - j|$ for $\omega$ in $S_i$.

This case is referred to as the discrete absolute error loss function.

(II)
\[
L_j(\omega) =
\begin{cases}
0, & \omega \in S_j, \\
c, & \omega \notin S_j.
\end{cases}
\]

The second example corresponds to the case where one assigns a constant loss
$c$ for any error and zero loss for a correct decision.

The fact that, if it exists, the monotone unbiased procedure is unique lends 
greater significance to this principle. 

Examples I and II above are special cases of loss structures having the form
$L_{ij} = f(|i - j|) = L_{|i-j|}$. Loss structures of this general pattern possess
considerable interest since many practical problems arise in which the incurred
losses can be assumed to be proportional to the magnitude of the error and
unrelated to the type of error. In the event that $L_{ij} = L_{|i-j|}$ (we say $L_{ij}$ has a
convolution form), condition (b) implies that $L$ is a concave function of $|i - j|$,
i.e., $L_{r+1} \ge \tfrac{1}{2}(L_r + L_{r+2})$, $r = 0, 1, \ldots, n - 2$. This is to say the loss increases
concavely as the action actually taken diverges from the correct action. That
concavity implies condition (b) is also true, so condition (b) is fully equivalent
to the concavity of $L_{|i-j|}$ as a function of $|i - j|$. Moreover, condition (c) is
automatically satisfied if $L_{|i-j|}$ is concave since $b_{ij} \ge b_{i+1,j}$ for $j \le i$ and
$b_{ik} \le b_{i+1,k}$ for $k \ge i + 2$. Therefore, for this convolution case, the hypotheses of
Theorem 1 are equivalent to the statement that $L_{|i-j|}$ is a concave function of
its argument.
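
The relation between condition (b) and concavity can be explored on small loss matrices. The sketch below is an illustration only: it encodes condition (b) as it is read in this section (the chains $0 \ge a_{i1} \ge \cdots \ge a_{ii}$, $a_{ii} < 0$ and $a_{i,i+1} \ge \cdots \ge a_{in} \ge 0$, $a_{i,i+1} > 0$), uses 0-based indices, and the names `satisfies_b` and `convolution_loss` are hypothetical.

```python
import math

def satisfies_b(L):
    """Check condition (b) for an n x n matrix L[i][j] = loss of action j when w is in S_i."""
    n = len(L)
    for i in range(n - 1):
        a = [L[i][j] - L[i + 1][j] for j in range(n)]   # a_ij = L_ij - L_{i+1,j}
        left, right = a[: i + 1], a[i + 1:]
        # 0 >= a_i1 >= ... >= a_ii, with a_ii strictly negative
        if any(u < v for u, v in zip([0] + left, left)) or left[-1] >= 0:
            return False
        # a_{i,i+1} >= ... >= a_in >= 0, with a_{i,i+1} strictly positive
        if any(u < v for u, v in zip(right, right[1:] + [0])) or right[0] <= 0:
            return False
    return True

def convolution_loss(f, n):
    """Loss matrix of convolution form, L_ij = f(|i - j|)."""
    return [[f(abs(i - j)) for j in range(n)] for i in range(n)]

n = 4
print(satisfies_b(convolution_loss(lambda r: 2.0 * r, n)))                # example (I), c = 2
print(satisfies_b(convolution_loss(lambda r: 0.0 if r == 0 else 1.0, n))) # example (II), c = 1
print(satisfies_b(convolution_loss(math.sqrt, n)))                        # concave in |i - j|
print(satisfies_b(convolution_loss(lambda r: float(r * r), n)))           # convex in |i - j|
```

The first three losses are concave in $|i - j|$ and pass the check, while the squared-error pattern is convex and fails it.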

It should be noted that condition (c) is not the same as condition (II) of [3].
However, in the important case $L_{ij} = L_{|i-j|}$, the two conditions are equivalent
since the two $b_{ij}$ matrices are identical. Consequently, when the loss function
$L_{ij} = L_{|i-j|}$ is concave, all non-degenerate monotone procedures are admissible.
In particular, the unique unbiased procedure guaranteed by Theorem 1, which
is also shown to be non-degenerate (Corollary 4), is necessarily admissible in the
case where $L_{ij}$ is of the convolution form.
(The proof of Theorem 2 of [3] is easily seen to apply in the case of loss
functions of convolution form satisfying (b) and (c), above.)

The principal theorem concerning unbiased procedures is the following:

THEOREM 1. If assumptions (a), (b), and (c) are satisfied, then there exists a
unique monotone procedure which is unbiased in the sense of Lehmann.

To avoid inessential tedious details we assume that $p(x, \omega)$ is strictly Pólya
type 2, and $\mu$ is a continuous measure whose spectrum is an interval. The
analogous results when the assumption on $\mu$ is relaxed are immediate.

The proof of Theorem 1 is more elaborate and will be presented in Sec. 4. We
dwell in this section on the important special case of (I) where the proofs are
considerably simpler and for which some additional results are obtained
(Theorem 2).

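Proof of Theorem 1 for the special case (I). For a monotone procedure
$(x_1, \ldots, x_{n-1})$ define

\[
\begin{aligned}
A_1(\omega) &= c \int_{x_1}^{x_2} p(x, \omega)\, d\mu(x) + 2c \int_{x_2}^{x_3} p(x, \omega)\, d\mu(x)
 + \cdots + (n - 1)c \int_{x_{n-1}}^{\infty} p(x, \omega)\, d\mu(x), \\
A_2(\omega) &= c \int_{-\infty}^{x_1} p(x, \omega)\, d\mu(x) + c \int_{x_2}^{x_3} p(x, \omega)\, d\mu(x)
 + \cdots + (n - 2)c \int_{x_{n-1}}^{\infty} p(x, \omega)\, d\mu(x), \\
&\ \ \vdots \\
A_n(\omega) &= (n - 1)c \int_{-\infty}^{x_1} p(x, \omega)\, d\mu(x) + (n - 2)c \int_{x_1}^{x_2} p(x, \omega)\, d\mu(x)
 + \cdots + c \int_{x_{n-2}}^{x_{n-1}} p(x, \omega)\, d\mu(x).
\end{aligned}
\]

For $\omega \in S_i$, $i = 1, \ldots, n$, $\rho(\omega, \varphi) = A_i(\omega)$. Define

\[
B_i(\omega) = A_i(\omega) - A_{i+1}(\omega), \qquad i = 1, \ldots, n - 1.
\]

It is immediate that

\[
B_i(\omega) = -c \int_{-\infty}^{x_i} p(x, \omega)\, d\mu(x) + c \int_{x_i}^{\infty} p(x, \omega)\, d\mu(x), \qquad i = 1, \ldots, n - 1.
\]

In order that the monotone procedure be unbiased it is necessary and sufficient
that $B_j(\omega) \ge 0$, $j = 1, \ldots, i - 1$, and $B_j(\omega) \le 0$, $j = i, \ldots, n - 1$, for $\omega \in S_i$,
$i = 1, \ldots, n$. Choose the unique $x_1 = x_1^0$ which satisfies $B_1(\omega_1^0) = 0$. Then
$B_1(\omega) \lessgtr 0$ for all $\omega \lessgtr \omega_1^0$. Since $x_i \ge x_1^0$ for $i = 2, \ldots, n - 1$, $B_i(\omega) < 0$
for $\omega < \omega_1^0$, $i = 2, \ldots, n - 1$. Unbiasedness further requires that for $\omega \in S_2$,
$B_1(\omega) \ge 0$ and $B_i(\omega) \le 0$ for $i = 2, \ldots, n - 1$. Determine the unique $x_2 = x_2^0$
such that $B_2(\omega_2^0) = 0$. Then $x_2^0 > x_1^0$ since $\omega_2^0 > \omega_1^0$, and $B_i(\omega) \le 0$ for $\omega \in S_2$ and
$i = 2, \ldots, n - 1$. The continuation of this construction will produce the unique
monotone unbiased procedure $(x_1^0, \ldots, x_{n-1}^0)$.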
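
Since $B_i(\omega_i^0) = 0$ says that $x_i^0$ splits the distribution at $\omega_i^0$ into two halves of equal mass, each critical value in the construction above is a median. The sketch below computes these medians by bisection in one assumed example, the exponential scale family with distribution function $1 - e^{-x/\omega}$, where the answer $\omega_i^0 \log 2$ is known in closed form. The names `exp_scale_cdf`, `median_cut`, and `unbiased_cuts` are hypothetical.

```python
import math

def exp_scale_cdf(x, w):
    """Distribution function of the exponential law with mean w (assumed example family)."""
    return 1.0 - math.exp(-x / w)

def median_cut(cdf, w0, lo, hi, tol=1e-12):
    """Solve cdf(x, w0) = 1/2 for x by bisection on [lo, hi], i.e. B_i(w0) = 0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid, w0) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def unbiased_cuts(cdf, thresholds, lo, hi):
    """Critical values x_i^0 for the loss c|i - j|: the median at each boundary w0_i."""
    return [median_cut(cdf, w0, lo, hi) for w0 in thresholds]

cuts = unbiased_cuts(exp_scale_cdf, (1.0, 2.0, 4.0), 0.0, 100.0)
print(cuts)
print([w0 * math.log(2.0) for w0 in (1.0, 2.0, 4.0)])  # closed form for comparison
```

The bracket `[lo, hi]` must contain every median; it is a tuning choice of this sketch rather than anything prescribed by the text.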

For the special loss function under consideration this unique unbiased
procedure is uniformly most powerful within the class of all unbiased procedures.
This is the substance of the following theorem which is a special case of Theorem 2
of [8]. The proof is included by merit of its simplicity and because it also
illustrates on a small scale some of the ideas necessary in carrying out the
arguments of Theorem 1.

THEOREM 2. If $L_j(\omega) = c\,|i - j|$ for $\omega$ in $S_i$, then any unbiased procedure
$\varphi = (\varphi_1, \ldots, \varphi_n)$ is everywhere improved upon by the unique monotone unbiased
procedure, except possibly at $\omega_1^0, \omega_2^0, \ldots, \omega_{n-1}^0$.

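Proof. By definition,

\[
\begin{aligned}
B_1^{\varphi}(\omega) = A_1(\omega) - A_2(\omega)
&= -c \int \varphi_1(x)\, p(x, \omega)\, d\mu(x) + c \int (1 - \varphi_1(x))\, p(x, \omega)\, d\mu(x) \\
&= c - 2c \int \varphi_1(x)\, p(x, \omega)\, d\mu(x),
\end{aligned}
\]

and for $k = 2, \ldots, n - 1$,

\[
B_k^{\varphi}(\omega) = A_k(\omega) - A_{k+1}(\omega) = c - 2c \int [\varphi_1(x) + \cdots + \varphi_k(x)]\, p(x, \omega)\, d\mu(x).
\]

Consider any other decision procedure $\varphi^*$ which is not necessarily unbiased.
For $k = 1, \ldots, n - 1$,

\[
B_k^{\varphi}(\omega) - B_k^{\varphi^*}(\omega) = 2c \int \big[ (\varphi_1^*(x) + \cdots + \varphi_k^*(x))
 - (\varphi_1(x) + \cdots + \varphi_k(x)) \big]\, p(x, \omega)\, d\mu(x).
\]

If $\varphi^*$ is the monotone procedure constructed so that it improves upon $\varphi$ according
to Lemma 4 of [4], then $\varphi^*$ satisfies

\[
B_k^{\varphi}(\omega) - B_k^{\varphi^*}(\omega)
\begin{cases}
\ge 0 & \text{for } \omega \le \omega_k^0, \\
\le 0 & \text{for } \omega \ge \omega_k^0,
\end{cases}
\]

for $k = 1, \ldots, n - 1$. But $B_k^{\varphi}(\omega_k^0) = 0$. Therefore, $B_k^{\varphi^*}(\omega_k^0) = 0$, which implies
that $\varphi^*$ is unbiased. Since there is only one monotone unbiased procedure, $\varphi^*$
must be identical with the $\varphi^0$ of Theorem 1.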

The limiting case of an $n$-action problem as $n \to +\infty$ is an estimation problem.
Suppose that for the problem under consideration the limit is taken in such a
manner that as $n \to \infty$, $\omega_1^0 \to -\infty$, $\omega_{n-1}^0 \to +\infty$, $|\omega_i^0 - \omega_{i-1}^0| \to 0$, $i = 2, \ldots,
n - 1$, and $L_n(\omega, i_n(a)) \to c\,|a - \omega|$, where $i_n(a)$ is defined by $a \in S_{i_n(a)}$. The
resulting problem is an estimation problem with absolute error loss function. It
is easily verified that the estimate $\delta(x)$, which is the limit of the unique monotone
procedures which are unbiased in the sense of Lehmann, is defined by the relation

\[
\int_{-\infty}^{x} p(y, \delta(x))\, d\mu(y) = \int_{x}^{\infty} p(y, \delta(x))\, d\mu(y). \tag{12}
\]

This, of course, states the well-known fact that the median unbiased estimate
of $\omega$ is the function $\delta(x)$ which satisfies (12) when $x$ is observed.
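
Relation (12) can be solved numerically for $\delta(x)$ once a family is fixed. The sketch below does so for an assumed example, the exponential scale family with distribution function $1 - e^{-x/\omega}$, where (12) reduces to $\delta(x) = x / \log 2$. The function names are hypothetical, and the bisection relies on the assumption that the distribution function at a fixed $x$ decreases as the parameter grows, as it does for monotone likelihood ratio families.

```python
import math

def exp_scale_cdf(x, w):
    """Distribution function of the exponential law with mean w (assumed example family)."""
    return 1.0 - math.exp(-x / w)

def median_unbiased_estimate(cdf, x, lo, hi, tol=1e-12):
    """Solve cdf(x, delta) = 1/2 for delta on [lo, hi]; assumes cdf(x, .) is decreasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(x, mid) > 0.5:   # x lies above the median at mid, so delta must be larger
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_observed = 2.0
print(median_unbiased_estimate(exp_scale_cdf, x_observed, 0.01, 100.0))
print(x_observed / math.log(2.0))  # closed form for comparison
```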

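4. Proof of Theorem 1. For purposes of clarity the proof of the theorem is
divided into a series of separate steps. First, we introduce the relevant quantities
entering into the analysis. For a procedure $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_n)$, let

\[
A_i^{\varphi}(\omega) = L_{i1} \int \varphi_1(x)\, p(x, \omega)\, d\mu(x) + \cdots + L_{in} \int \varphi_n(x)\, p(x, \omega)\, d\mu(x)
\]

for $i = 1, \ldots, n$. When $\omega$ ranges over $S_i$ the function $A_i^{\varphi}(\omega)$ coincides with
$\rho(\omega, \varphi)$, the expected risk. Also for $i = 1, 2, \ldots, n - 1$, we define

\[
\begin{aligned}
B_i^{\varphi}(\omega) &= A_i^{\varphi}(\omega) - A_{i+1}^{\varphi}(\omega) \\
&= a_{i1} \int \varphi_1(x)\, p(x, \omega)\, d\mu(x) + \cdots + a_{in} \int \varphi_n(x)\, p(x, \omega)\, d\mu(x) \\
&= -b_{i1} \int \varphi_1(x)\, p(x, \omega)\, d\mu(x) - \cdots - b_{ii} \int \varphi_i(x)\, p(x, \omega)\, d\mu(x) \\
&\quad + b_{i,i+1} \int \varphi_{i+1}(x)\, p(x, \omega)\, d\mu(x) + \cdots + b_{in} \int \varphi_n(x)\, p(x, \omega)\, d\mu(x).
\end{aligned} \tag{13}
\]

If a decision procedure $\varphi$ satisfies the system of inequalities

\[
\begin{aligned}
B_k^{\varphi}(\omega) &\ge 0, \qquad 1 \le k \le i - 1, \\
B_k^{\varphi}(\omega) &\le 0, \qquad i \le k \le n - 1,
\end{aligned}
\qquad \text{and } \omega \text{ in } S_i, \tag{14}
\]

then $\varphi$ is clearly unbiased in the sense of Lehmann. In general, the converse is
not valid. However, it is true that for monotone procedures the property of
unbiasedness implies that this system of inequalities is satisfied. The inequalities
are fulfilled for a monotone procedure $\varphi = (x_1, x_2, \ldots, x_{n-1})$ if and only if

\[
B_j^{\varphi}(\omega_j^0) = 0, \qquad j = 1, \ldots, n - 1. \tag{15}
\]

In fact, the variation diminishing properties of the density $p(x, \omega)$ imply that
$B_j^{\varphi}(\omega) < 0$ for $\omega < \omega_j^0$ and $B_j^{\varphi}(\omega) > 0$ for $\omega > \omega_j^0$, which in turn are equivalent to
the system of inequalities (14). Our problem reduces to the demonstration of the
existence and uniqueness of a set of values $x = (x_1, x_2, \ldots, x_{n-1})$, where
$x_1 \le x_2 \le \cdots \le x_{n-1}$, which are a solution to the system of non-linear equations: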


\[
\begin{aligned}
B_i^{x}(\omega_i^0) = {} & -b_{i1} \int_{-\infty}^{x_1} p(\xi, \omega_i^0)\, d\mu(\xi) - \cdots - b_{ii} \int_{x_{i-1}}^{x_i} p(\xi, \omega_i^0)\, d\mu(\xi) \\
& + b_{i,i+1} \int_{x_i}^{x_{i+1}} p(\xi, \omega_i^0)\, d\mu(\xi) + \cdots + b_{in} \int_{x_{n-1}}^{\infty} p(\xi, \omega_i^0)\, d\mu(\xi) = 0,
\qquad i = 1, \ldots, n - 1.
\end{aligned} \tag{16}
\]

Turning to this task we start by showing that the mapping $x \to y$ which is
defined coordinate-wise by $y_i = B_i^{x}(\omega_i^0)$, $i = 1, \ldots, n - 1$, and which maps the
$n - 1$ dimensional simplex of all $n - 1$ tuples $x = (x_1, x_2, \ldots, x_{n-1})$ satisfying
$x_1 \le x_2 \le \cdots \le x_{n-1}$ into Euclidean $n - 1$ dimensional space ($E^{n-1}$) is a
one-to-one mapping. Precisely:

Lemma 6. The mapping $y_i = B_i^{x}(\omega_i^0)$, $i = 1, \ldots, n - 1$, defined on the set of all
monotone procedures by means of the formulas (16) with image in $E^{n-1}$ space is a
one-to-one transformation.


Proof (by contradiction). Suppose there exist two different monotone procedures 
go ~2z = (%1,%2,°°* , tes) andy’ ~ x’ = (ary Xs oe Ie 1) With the property 
that Bz (w:) — Biwi) = 0 fori = 1,---,n — 1. Without loss of generality 
assume 2; > 2,. Bi (wt) — Biwi) = 0,2 = 1,---,n —1, yields the system 
of equations 


<i 2 
0 — (br + bie) | P(x, w}) du(x) + (bi — dys) | p(x, w1) du(x) 
Jay Jes 


eTn—1 


tees + (Or naa — Din) p(x, wi) du(x 
| a 
i re 


= (be — be) p(x, ws) du(x) — (bee + bes) p(z, w2) du(x) 


rrn—1 


+ (D2n-1 — ben) | p(x, 0) dulx 


“Zn-1 


, 


“Z) 


(bp-1,2 imal b,-1.1) | P(r, theca) d(x) 


“2 


rIn—2 
+ ae + Rint acd i ba—t,n—2) p(z, wn—1) du(x) 


| 
i ee | p(x, Gana) du(x). 
“Zn-1 
Since (by + by) > (bi — big + +--+ + (Oi.n-1 — brn), it follows that there exists 
ak,1< k < n — 1, such that 


a 


Tk 
| p(x, 1) du(x) < / p(x, wi) du(z) 


“zl 7k 


for 1 <1 < k. If k is not unique, choose the largest k which satisfies this prop- 
erty. Consider the kth equation. For 1 S$ l < k, 


| 


’ 
pt} 


ork 
p(x, we) du(x) < p(x, wz) du(x) 


zr! “2k 


by the fundamental change of sign property for strictly Pélya type 2 densities 
. / / 
since 7; < Tk and Zi < Ze But (bi. + be.k-41) = (bye — bys) Sf eee +H (Dex 7 be.-—1) 
+ (bi441 — be nege) Heee + (Den b,.n). Therefore on examination of the 


kth equation, if k <n — 1, there exists an h > k for which 


oz) zi 
/ 9 0 
p(x, wi) du(x) < | p(x, we) du(x) 
Faw 


“z1 


for 1 <1 < h. If h is not unique, choose the largest h. 
Continue this argument until at the last step it has been established that 


pri za-l bl 
; wie. w,—1) du(x) < | p(x, weal du(x) 
*23 *Zn—1 
for 1 <1 <n — 1. But this contradicts the fact that BZ_,(w%_,) — Bx_,(w2_1) = 
0 since (Dn-1.n-1 + Dn-t.n) > (On-1.2 — On-ra) + °° + (On-t.n-1 — Da-t.n—2)- 


Corollary 1. There exists at most one monotone unbiased procedure.
The proof is immediate. We shall need the following slight extension
of Lemma 6.
-, 2n-1) and ¢! ~ 


Corollary 2. If ¢ is the monotone procedure x = (x, %2, °° 
y= (x; . vs + tn 1) with In 1 = In-1 and Bz (w}) — Bilw:) = 0 fori = 
1,2, --,n—1, thenz; = t:fort = 1,2,---,n—1. 

The proof of Corollary 2 is essentially a paraphrase of that of Lemma 6. We
sketch the details. Let $k$ be the first index where $x_k' > x_k$ ($k < n-1$). By
examining the $k$th relation $B_k^{\varphi'}(\omega_k^0) - B_k^{\varphi}(\omega_k^0) = 0$ as in the proof of the lemma,
when $k < n-1$, we may find a larger index $h > k$ such that for $i < h$,


, 


i 0 zh 0 
/ > ) - s /e y »\ 
plé, we) dult) < | p(t, we) du(é). 


“2 


-z 


“ZA 


From the variation diminishing properties of p(t, w) we may conclude that 


fori < h, 

“2% rh 
| plé, wr) du(t) < | pcg, wr) dy(é). 
zy “2h 


On continued inspection of the Ath relation, we find a larger index until we 


reach the (n — 1) index with the property that 


| p(g ws) dult) << | pl(E, wa), dul), i=1,2,---,n- 


“24 “fn-1 


The last inequality 
Bu-i(wn1) — Balen.) = 0 


is evidently contradicted. 


One final extension in the same direction is the following: 
Corollary 3. If $\varphi \sim (x_1, x_2, \ldots, x_{n-1})$ and $\varphi' \sim (x_1', x_2', \ldots, x_{k-1}', y, \ldots, y)$
are two monotone procedures such that $B_i^{\varphi'}(\omega_i^0) - B_i^{\varphi}(\omega_i^0) = 0$ for $i = 1, \ldots, k-1$
and $y \ge x_{n-1}$, then

$B_k^{\varphi'}(\omega_k^0) - B_k^{\varphi}(\omega_k^0) \le 0.$


The proof follows the same line of reasoning as the preceding. 
In view of Corollary 1 it remains to prove the existence part of Theorem 1. 
We require the following lemma. 


Lemma 7. Let the 2 X m matrix (e;;), i = 1, 2,7 = 1,--+ , m, consist of non- 
negative elements, and let 1 , --~ , \m be non-negative constants. Let condition (E) 
be satisfied: 

1; ex | 
(E) ; | > 0 
Cay Oe 





frlsjsll+2skem1f0 <eyit--: tei S Grpdiet --- + 
€1,mAm, then ends + +++ + Carr S C2.rgedt42 + °°* + ComAm- 

Proof. By (E), Dsja: (exxe1j — e1eas)A; = 0 for k => 1 + 2. Therefore, ex Do}-1 
€1jAj = Cue int €2jdj, ANd Demise Caxrx° iat ajAj 2 D melse ews Djat €2;A; . 
For 0 < Pius aijAj S p » CA » tat ex, = Laat €2jA; - 

Proof of existence. It suffices to show there exists a monotone $\varphi$ for which
$B_i^{\varphi}(\omega_i^0) = 0$, $i = 1, \ldots, n-1$. This holds trivially for $n = 2$. Suppose it is
true for the case of $n$ actions. The argument is inductive. For $n + 1$ actions
and a monotone procedure, let


BP (w) = —ba [ p(z, w) dp(zx) — see =i; p(z, w) dy(z) 


ti-1 


rZi4n ed 
+ disor | ple, e) du(z) + ++ + Bim | p(x, @) du(z) 
fort = 1,---,%. 

(1) Choose z, = «. The conditions (a), (b), and (c) are fulfilled so by the 
induction hypothesis there exists a solution g® ~ (zt, ++: , 2a-1, ©) of the 
system of equations Bf(w;) = 0,7 = 1, --- ,n — 1. For this solution obviously 
Bi(wa) < 0. 

(2) Choose z,1 = z,. By the induction hypothesis there exists a solution 
eo ~ (xi, --+, co = 2, 2h = 2°) of B&(wt) = 0,7 = 1,---,n — 1. Since 
a (w,-1) = 0, the variation diminishing properties of densities possessing a 
strict monotone likelihood ratio lead to the conclusion that . (w,) = 0. If 


Br (w.) = 0, then it follows that z = -—o which in turn implies that 
B%, (wn) > 0. On the other hand, if BZ_; (w,) > 0, let l = n — 1,m = n, e; = 
ba1,j forj = 1,---,m — 1, Cin = Dns.n4i, C23 = bn; forj = 1,---,n — land 


€on = bn.n41 in Lemma 7. Then, by Lemma 7 . (w.) > 0 implies x (wn) = 0. 

It has been shown thus far that there exists a strategy (zf,--- , ra1, ©) 
such that BY (wi) = 0,7 = 1,---,n — 1, and Bf (w,) S O anda strategy 
(ai, +++, 2n-1 = 2, Zn = 2) such that BY (wy) = 0,7 = 1,---,n — 1 and 
BE (ws) = 0. If it can be shown that for every z, satisfying x <2, < © there 
exists a solution (2; ,--+ , Zn-1) to BY (wi) = O71 = 1, --- ,n — 1, then by con- 
tinuity a solution exists satisfying Bf (wi) = 0,7 = 1, --- , n; the continuity of 
the solution as a function of z, being a simple consequence of Lemma 6. 

The proof that for every z, 2° < z < , there exist (x,(z), --- , 2,-1(z)) such 
that ¢ ~ (2;(z), «++ , Zn-1(z), 2) satisfies Bf (w;) = 0,7 = 1,---,n — 1, pro- 
ceeds in a stepwise manner. 

(3) Let zr) S te = +++ = Zan = Z. 

(a) Choose z; = 2. Since by = by 2 --- = ding, 


mu [* p(ae8) dua) + dress |, plz, 02) dul) 
a 


lA 
oO 


which implies 


IIA 
o 


—bu | p(x, ot) du(z) + diets | plz, 0%) du(z) 


. 0 0 
since 7} S 2, < Zz. 


(b) Choose 7} = —®. 


be [ p(x, w) du(z) + bins | p(x, wr) dy(x) = 0. 


(c) Thus by continuity there must exist an zi = 21(z) which satisfies 


—bu [ , p(z, w) du(z) + by [ p(z, w}) dy(z) + Di ast [ p(z, w) du(z) = 0. 


(4) Let 2, S xe S 2; = +++ = Zn. = 2. Consider the two expressions 


2} 22 
C1(w: 2, 22) = —bn | p(x, w) du(x) + be [ p(x, w) du(zx) 


71 


+ brs | p(2,«) du(z) + dren | ple, 0) du(2), 
¢2(w; 21, 22) = —bn [ p(x, w) dp(x) — be | p(x, w) du(x) 


+ be | p(x.) du(a) + been | pl2,a) dale). 


Of course c;(w, 2:, 22) = Bf(w), 7 = 1, 2, for the special procedure 
¢ ~ (1,22, 2, +++ z). Our immediate object now is to show that x, and 2, exist 
satisfying (r, S z2 S z) such that e(w) 321, 22) = O and c(w2 ; 2, 22) = 0. 

. ° , : 
(a) Choose zr. = z. By (3) above there exists an 2;(z) for which 
(wr : 24(z), z) = 0. 
We assert that co(w ; 21(z), z) Ss 0. Comparing for i = 1, 2 BE’ (wt) and 
0 0 


0 0 0 © 0 ‘ 
Bf(w) where ¢ ~ (21, 22, °°: , Zana = 2, 2a = 2) of (2) above and 
e ~ (x1(2), 2, Z, «++ , 2) With z > 2°, we see the conditions of Corollary 3 are met 
and therefore we may conclude Co(ws , x1(2), z) S O as stated. 

(b) Choose x; = 22 ; then (or; —*, —%) = Oand ¢,(w; ; z, 2) S 0 by 
(3a). Thus there exists a u = 2; = 2x2 such that ¢;(w; ; u, uv) = 0 which implies 
ei(we 3 u, u) = O If ey(wr ; wu, = 0, then « = — « which in turn implies 
cows > u,u) > 0. If in the other circumstance ¢;(«> ;u, u) > 0, then by Lemma 
7 we infer that co(w: ; uv, u) = 0. 

We next prove that there exists an r= at y) such that ¢(wi ; — y) = 
0 for every u < y < z. (This is like the larger problem we are trying to solve for 
the special case when n = 2. The quantity z plays the role of ~« and u adopts the 
role of z.) When x; = y, o(w) ; y, y) < 0 beeause ¢;(w} ; u, w) Oand y > u. 
Obviously ¢(#: ; — ~, y) > 0. By continuity there exists an «} such that 
a(w sat, y) = 0. 
Since ¢,(w) ; 21, y) = 0 has a gry r} for every y in the interval {u, 2] and 
Ce ort : x3(z), z) < 0, co(we ; u, vu) = 0, by continuity there must exist an 23(z) = 


0 ° 


€ [u, z] and ni) such that ¢)(w) : i, 23) = Co(ws :2x1,,2%3) = O. 


, 


(5) Letm Same S287 --+ = zg, Consider the three expressions 


D,(w; 1, 2, 23) = | p(a,«) du(a) 


j=l “2j-1 


3 pz; ez 
a = bi; | p(x, w) du(x) + dig | p(x, w) du(x) 
Ja 


j=it+l “Zj-1 


ee 


+ Binet | p(x, w) dy(x), 


2 


t = 1, 2, 3, where x = — oo. Of course D,(w; 2 , 72, 73) = Bi(w) where ¢ ~ 
(a1, %2, Xs, Z, 2,°-*+ , 2). The next step is to try to solve Dot + 21. Xo, 3) 
0,2 = 1, 2, 3. 

(a) Choose x; . By (4) above there exists a couple (x i(z), 4 wa i such that 
Dior ; ; ri(z), x3(z 2 z) = Dolws : xilz ), r2(z), z) = 0. Corollary 3 may be — 
and we find that on comparison with the rel: itions B?*(«;) = 0, i 1, 2, 3, for 

~ (x3 ‘ xe _ eee g r?) of (2), D3( (ws - xi(z) 3(z), = @. 

(b) Choose x2 = 23. By (4) there exists a mara (%,(w), w) where x2 = 
xz; = w to the equations Dy(wr : 21, eo, 2: , Do(wr : Li» Sa, Me) = V. 
D:(w3 3%, w, w) = Oisa eee of bieia? t. 

(c) There exists a couple (xf*(y), x7*(y)) such that 


D,(wi ; 2I*(y), 22*(y), y) = aad ; ai*(y), 22° (y), y) 


for every y € [w, 2]. 
The proof of this step requires a repetition of the previous arguments as carried 
out for the function c; with y taking the role of «. To this end, we establish 
(c.1) Choose 2; = x2. For 27; = —«, ng ,—«,—o,y) > 0. Forx2 = 
Y, Dy(ws 3Y¥,Y, Y¥) = Osince Dy(w4 :Z,(w), w, w) = 0 implies Dias -w,uw,w) £0 
and y > w. Therefore, there exists 2; = v ‘uch ‘th it Di(w) 3 v, v, y) = 0. It can 
be shown by applying Lemma 7 that D2(w2 ; v, v, y) = 0. 
(c.2) Choose x2 = y. Di(w: ; — *, y, y) = Oand D,(w; ;¥, ¥, y) = 0. Thus, 
there exists 2,(y) such that D,(w} ; xi(y), y, y) = 0. Delo ; aly), y, y) S 9. 
The last inequality may be deduced from Corollary 3 by comparing the pro- 
cedures ¢’ ~ (x,(y), y, y, 2, 2,°°* , 2z) and g ~ (4,(w), w, w, z, z,--* , 2). 

In fact, suppose the inequality D2(w: ; x:(y), y, y) S 0 is violated. Consider the 
solution (Z,(w), w, w) to the system of equations 


Dy(wr 5 1, T2, 42) = Dew; 11, t2, %2) = O. 


Di(wi ; ry), y, ¥) — Dy(wt ; %(w), w, w) 


(17) z,(y) ; 3 ey 
= —(bu + dy) [ p(x, 1) du(x) + (dis — dry) | p(x, w;) du(x) = 0, 
~2,(w) “w 
De(ws : nty),y,y) — D2(w2 ; E(w), w, w) 
(18) ry 


2, (y) 1 - 
= (be — be) [ p(x, we) du(x) — (bes + be) | p(x, w2) du(x) > 0. 


~2;(w) “wu 


Eq. (17) implies f% p(x, 3) du(x) > S22) p(x, w}) du(x), but this contradicts 
(18). Therefore, D2(w: ; a:(y), y, y) S 0. 


(c.3) For every 22 €[v, y], Di(wl ; 21, 22, y) = 0 hasa solution. By con- 
tinuity, then, there exists an ai (y), 22 (y) such that D,(w; ; 23*(y), x2 *(y), y) = 
De(w2 ; 21" (y), x2" (y), y) = 0. 

(a), (b), and (c) of (5) show that there exists a 3-tuple (x3 (z), r3(z), 13(z)) 
which satisfies D\(w? ; x} , x3 , x3) = 0,7 = 1, 2,3. 
The steps for 7; S v2 S 23 S x, S x5 = «++ = z utilize the same principles as 


those employed above. The general pattern should now be clear to the reader. 
The next step would consider the four functions Ey(w; 21, 22, %3, %) = 


1\ - 


Bf(w),i = 1,--+,4 where g ~ (2, 2, 43, 24, 2, 2, +++ , 2). It is necessary to 


show that E,(w?) = 0,7 = 1, 2, 3, 4, have a solution in 2 , re, 2; , and x,. This 
entails repeating the entire preceding argument for the case of one, two, and three 
functions in each case using a suitable comparison monotone procedure. We 
sketch the argument. Setting 7, = z we obtain by (5) that there exists a tuple 
(a;(z), xo(z), 23(2), 2) for which E(w: 5 21(z), 22(z), 23(z), 2) = O fort = 1, 2, 3. 
Corollary 3 may be applied by using the second procedure ¢’ ~ (x} , 22, «++ , 2m) 
to show that Ey(w4 ; 21(z), x2(z), 23(z), z) S 0. Next put 2; = 2, = ¢ < z and 
again by (5) we obtain a tuple (z,(t), r2(t), t, t) for which Eat 3 2(t), 2,(t), t,#) = 
0 for 7 = 1, 2, 3. According to Lemma 7, Elot x(t), xre(t), t, t) = 0. Given y, 
t < y < z, it would be enough to construct a solution to E(w: 321, 22,23, y) = 
0,7 = 1, 2, 3, for then by continuity there would exist a solution to E,(wi) = 
0,7 = 1, 2, 3, 4. The analysis of E (ws 3%1,%2,%3, y),t = 1, 2, 3, is similar to 
the arguments of (5) this time using the comparison procedure 


g ~ (a(t), re(t), t, t, 2, 2,°°° , 2) 


as ¢ ~ (Z,(w), w, w, z, «++ , z) was used in (5). For the final step we repeat this 
sequence of arguments n — 1 times. This completes the proof of Theorem 1. 
Corollary 4. The unique monotone unbiased procedure defined by Theorem 1 is
non-degenerate.

Remark. Since the density p(x, w) is assumed to have a strict monotone likeli- 
hood ratio, the set co. = {x| p(x, w) > 0} is independent of w [4]. The concept of 
an interval (z; , x41) being degenerate should therefore be understood as taken 
with respect to du(zx). 

Proof. Suppose the unique unbiased procedure a = (t0, 21,225 °°* 5 La-15 In) 
where zr = — © and 2, = + © possesses a degenerate interval. We shall prove 
that this assumption leads to an absurdity. First, observe that (zo , 21) must be 
non-degenerate. Otherwise, let jo be such that (x;, , 2j,11) is the first non-de- 
generate interval and jo = 1. By condition (b), Bj.-:(w3,-2) > 0, which con- 
tradicts the definition of x’. Now let io be the earliest interval where (xj, , 2i,+1) is 
degenerate. Therefore by what has been established 7 = 1 and also ip < n — 1 
for in the contrary case ae hone) would be negative, Let ko denote the smallest 
index larger than % for which (z;, , 2,41) is non-degenerate. A value of k must 
exist, for otherwise Bi, (wy) < 0. 

The strict monotone likelihood ratio possessed by p(x, w) implies that 


I - p(E, wi) du(E) | pls, wt) du(e) 
(+) Zi+1 tr+1 
= , ws) , wy Iu(é) 
t plé, a? ue) p(t, w®) dulé 


for every 7 < and r = ko with strict inequality valid for 7 = to — landr = ky. 
Equation (*) in conjunction with conditions (b) and (c) and Bj,(wi,) = 0 readily 
leads to the result 


Bi, (wt) > 0, 


which is impossible. This completes the proof. 

In any special case this construction is considerably more facile than the
general proof shows. We carry this out for the special case whose loss function is
(II) of the preceding section. For any prescribed $x_{i-1} < x_i$ a value $x_{i+1}$ ($x_{i+1} > x_i$)
is determined recursively, whenever possible, by

(19)  $\int_{x_{i-1}}^{x_i} p(\xi, \omega_i^0)\,d\mu(\xi) = \int_{x_i}^{x_{i+1}} p(\xi, \omega_i^0)\,d\mu(\xi)$

for $i = 1, 2, \ldots, n-1$, where $x_0 = -\infty$. For $x_1$ sufficiently near $-\infty$, it is
possible to solve (19) for each $x_i$ such that $x_i > x_{i-1}$ and each is near $-\infty$.
Allowing $x_1$ to increase, we observe that each $x_i$ increases; and ultimately for
$x_1 < \infty$, $x_n$ reaches $\infty$. Let $x_i^*$ be the solution of (19) where $x_n^* = +\infty$. The
procedure $\varphi^* \sim (x_1^*, x_2^*, \ldots, x_{n-1}^*)$ is the unique monotone unbiased procedure
for the case where


) S; 
—* ‘6 aie 
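
As a rough numerical illustration of the recursion (19), the following Python sketch assumes that $p(x, \omega)$ is the normal density with mean $\omega$ and unit variance and that $\mu$ is Lebesgue measure. These choices, the function names `shoot` and `solve_cuts`, and the boundary values $\omega_1^0 = -1$, $\omega_2^0 = 1$ are illustrative assumptions and are not taken from the paper. The sketch starts from a trial $x_1$, solves (19) step by step, and bisects on $x_1$ until the last cut point is pushed to $+\infty$.

```python
from statistics import NormalDist

STD = NormalDist()  # assumption: p(x, w) is the N(w, 1) density


def shoot(x1, omegas):
    """Run recursion (19) from a trial x_1.

    Returns (excess, cuts): excess is the probability that the recursion
    would require below the next cut point, minus 1; cuts holds the cut
    points found so far (cuts[0] is x_1).
    """
    cuts = [x1]
    for j, w in enumerate(omegas):
        below_prev = 0.0 if j == 0 else STD.cdf(cuts[j - 1] - w)
        # (19): mass of the next interval equals mass of the current one.
        target = 2.0 * STD.cdf(cuts[j] - w) - below_prev
        if j == len(omegas) - 1 or target >= 1.0:
            return target - 1.0, cuts
        cuts.append(w + STD.inv_cdf(target))


def solve_cuts(omegas, tol=1e-12):
    """Bisect on x_1 until the final cut point x_n sits at +infinity."""
    lo, hi = min(omegas) - 8.0, max(omegas) + 8.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shoot(mid, omegas)[0] < 0.0:
            lo = mid   # x_1 too small: x_n would still be finite
        else:
            hi = mid   # x_1 too large: the recursion overshoots
    return shoot(lo, omegas)[1]


cuts = solve_cuts([-1.0, 1.0])  # n = 3 actions, assumed boundaries -1 and 1
```

With symmetric boundaries the two cut points come out symmetric about zero, which is what the uniqueness statement would lead one to expect in this assumed normal case.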
THE USE OF GROUP DIVISIBLE DESIGNS FOR CONFOUNDED
ASYMMETRICAL FACTORIAL ARRANGEMENTS¹


By MARVIN ZELEN

National Bureau of Standards

1. Introduction and summary. A factorial experiment involving m factors
such that the ith factor has $m_i$ levels is termed an asymmetrical factorial
design. If all factors have the same number of levels, the experiment is termed a
symmetric factorial experiment. When the block size of the experiment permits
only a sub-set of the factorial combinations to be assigned to the experimental 
units within a block, resort is made to the theory of confounding. With respect 
to symmetric factorial designs, the theory of confounding has been highly de- 
veloped by Bose [1], Bose and Kishen [4], and Fisher [11], [12]. An excellent 
summary of the results of this research appears in Kempthorne [13]. However, 
these researches are closely related to Galois field theory resulting in (i) only 
symmetric factorial designs being incorporated into the current theory of con- 
founding; (ii) the common level must be a prime (or power of a prime) number; 
and (iii) the block size must be a multiple of this prime number. 

The theory of confounding for asymmetric designs has not been developed 
to any great degree. Examples of asymmetric designs can be found in Yates 
[19], Cochran and Cox [9], Li [15], and Kempthorne [13]. Nair and Rao [16]
have given the statistical analysis of a class of asymmetrical two-factor de- 
signs in considerable detail. 

Kramer and Bradley [14] discuss the application of group divisible designs to
asymmetrical factorial experiments; however, their paper is mainly confined to
the two-factor case and its intra-block analysis.² It is the purpose of this paper,
which was done independently of their work, to outline the general theory for 
using the group divisible incomplete block designs for asymmetrical factorial 
experiments. 

The use of incomplete block designs for asymmetric factorial experiments 
results in (i) no restriction that the levels must be a prime (or power of a prime) 
number, (ii) no restriction with respect to the dependence of the block size on 
the type of level, and (iii) unlike the previous referenced works on asymmetric 
factorial designs, the resulting analysis is simple, does not increase in difficulty 
with an increasing number of factors, and “‘automatically adjusts’’ for the ef- 
fects of partial confounding. 


Received January 10, 1957; revised June 18, 1957; revised November 1, 1957. 

1 This paper is an extension of results presented at the Annual Meeting of the American 
Statistical Society, September, 1954 (cf. [22]). 

2 Note added in proof: The Editor has pointed out that the paper by K. R. Nair, “A
note on group divisible incomplete block designs”, Calcutta Statistical Association
Bulletin, Vol. 5, No. 17, (1953), pp. 30-35, together with Nair and Rao [16] essentially
contains the results for the intra-block analysis of the two-factor asymmetrical designs.






GROUP DIVISIBLE DESIGNS 23 


Section 2 states three useful lemmas, Section 3 contains the main results of 
this paper, and Section 4 outlines the recovery of inter-block information. 


2. Some useful lemmas. 

We state here three lemmas which will be referred to in later sections. Since
the proofs are trivial they are omitted.

Let $X' = (X_1, X_2, \ldots, X_n)$ have a multivariate normal distribution such
that

$E(X') = m' = (m_1, m_2, \ldots, m_n),$
$E[(X - m)(X - m)'] = M\sigma^2.$

Lemma 2.1. The expected value of the quadratic form $X'AX$ is
$E(X'AX) = m'Am + \sigma^2 \operatorname{trace}(AM).$
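
Lemma 2.1 has an exact sample analogue that is easy to check numerically: for any data set, the average of $x'Ax$ equals $\bar{x}'A\bar{x} + \operatorname{trace}(AS)$, where $S$ is the sample covariance matrix with divisor equal to the sample size. The Python sketch below (using NumPy) illustrates this; the particular $m$, $M$, $A$, $\sigma^2$, and the helper name `quad_form_mean` are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np


def quad_form_mean(m, A, M, sigma2):
    """Right-hand side of Lemma 2.1: m'Am + sigma^2 * trace(AM)."""
    return float(m @ A @ m + sigma2 * np.trace(A @ M))


rng = np.random.default_rng(0)
m = np.array([1.0, -2.0, 0.5])
L = rng.normal(size=(3, 3))
M = L @ L.T                      # an arbitrary symmetric non-negative matrix
A = rng.normal(size=(3, 3))      # A need not be symmetric
sigma2 = 1.7

X = rng.multivariate_normal(m, sigma2 * M, size=5000)
xbar = X.mean(axis=0)
S = (X - xbar).T @ (X - xbar) / len(X)      # sample covariance, divisor N

# Average of x'Ax over the sample versus the lemma applied to (xbar, S).
lhs = float(np.mean(np.einsum("ij,jk,ik->i", X, A, X)))
rhs = quad_form_mean(xbar, A, S, 1.0)
```

The two quantities agree up to floating-point error for every sample, which is the finite-sample counterpart of the expectation identity in the lemma.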
Lemma 2.2. If $M^2 = \lambda M$ ($\lambda$ a scalar), then the quadratic form

$\dfrac{(X - m)'(X - m)}{\lambda\sigma^2}$

follows a $\chi^2$ distribution with r degrees of freedom where $r \le n$ is the rank of M.
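Lemma 2.3. Define the direct-product of two square matrices A and B of
dimensions m and n respectively by

$(A * B) = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1m}B \\ a_{21}B & a_{22}B & \cdots & a_{2m}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mm}B \end{pmatrix}.$

If $A^2 = \alpha A$ and $B^2 = \beta B$ ($\alpha$ and $\beta$ are scalars), then $(A * B)^2 = \alpha\beta(A * B)$. In
general, given p matrices A, B, C, ... such that $A^2 = \alpha A$, $B^2 = \beta B$, $C^2 = \gamma C$,
..., we have $(A * B * C * \cdots)^2 = (\alpha\beta\gamma\cdots)(A * B * C * \cdots)$.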
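
The direct product of Lemma 2.3 is what NumPy calls the Kronecker product, so the lemma can be checked on small integer matrices. In the sketch below the matrices $mI - J$ (with $J$ the all-ones matrix) and $J$ itself are chosen only because they satisfy $A^2 = \alpha A$ exactly; the helper name `scaled_idempotent` is an illustrative assumption.

```python
import numpy as np


def scaled_idempotent(m):
    """Return m*I - J, which satisfies A @ A == m * A (J is all ones)."""
    return m * np.eye(m, dtype=int) - np.ones((m, m), dtype=int)


A = scaled_idempotent(2)          # A^2 = 2A
B = scaled_idempotent(3)          # B^2 = 3B
C = np.ones((2, 2), dtype=int)    # C^2 = 2C

AB = np.kron(A, B)                # direct product A * B
ABC = np.kron(AB, C)              # direct product A * B * C
```

Here `AB @ AB` equals `6 * AB` and `ABC @ ABC` equals `12 * ABC`, matching the products of the individual scalars.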


3. Analysis of group divisible designs used as asymmetrical factorials. 


3.1. Estimation. The group divisible designs are partially balanced incomplete
block designs with two associate classes. These were first discussed extensively
by Bose and Connor [3] and Bose and Shimamoto [5]. A large catalogue
of such experiment plans giving full details of the analysis can be found
in Bose, Clatworthy, and Shrikhande [2]. Designs with block size k = 2 have
been enumerated by Clatworthy [7]. Bose, Shrikhande, and Bhattacharya [6],
and Clatworthy [8] give methods for constructing group divisible designs.

Briefly, group divisible designs can be characterized by having b blocks with
k experimental units such that each of the v = mn treatments is replicated r
times. The v = mn treatments can be divided into m groups of n treatments
each, where any two treatments in the same group are 1st associates and two
treatments in different groups are 2nd associates. With respect to any treatment,
there will be (n − 1) 1st associates and n(m − 1) 2nd associates.
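
The association scheme just described can be spelled out in a few lines of Python. The labelling used here (treatments numbered 0 to mn − 1, with treatment u placed in group u // n) and the function names are illustrative assumptions.

```python
def associate_class(u, w, n):
    """Return 0 if u == w, 1 for 1st associates (same group), 2 otherwise.

    Assumed labelling: treatment u belongs to group u // n.
    """
    if u == w:
        return 0
    return 1 if u // n == w // n else 2


def associate_counts(m, n, u=0):
    """Count the 1st and 2nd associates of treatment u among v = m*n."""
    v = m * n
    firsts = sum(associate_class(u, w, n) == 1 for w in range(v))
    seconds = sum(associate_class(u, w, n) == 2 for w in range(v))
    return firsts, seconds
```

For m = 3 groups of n = 4 treatments, `associate_counts(3, 4)` gives (3, 8), that is (n − 1, n(m − 1)).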

Consider a factorial experiment with (g + h) factors $A_1, A_2, \ldots, A_g, B_1,
\ldots, B_h$ such that the number of levels associated with $A_s$ is $m_s$ for $s = 1, \ldots,
g$ and the number of levels associated with $B_r$ is $n_r$, $r = 1, 2, \ldots, h$. Furthermore,
let these levels be such that $m = \prod_{s=1}^{g} m_s$ and $n = \prod_{r=1}^{h} n_r$. Then one
can use the group divisible designs for a $\prod_{s=1}^{g} m_s \times \prod_{r=1}^{h} n_r$ factorial design
by arranging the v = mn treatments in an n × m array and assigning the
m factorial combinations among the A factors to the columns (groups) and the
n factorial combinations among the B factors to the rows.
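
The arrangement of the v = mn treatments can be sketched as follows. The function name `factorial_array` and the choice of levels (two A factors with 2 and 3 levels, two B factors with 2 levels each) are illustrative assumptions.

```python
from itertools import product


def factorial_array(a_levels, b_levels):
    """Return an n x m array (list of rows) of treatment labels.

    Columns (groups) are indexed by the combinations of the A factors,
    rows by the combinations of the B factors.
    """
    a_combos = list(product(*[range(k) for k in a_levels]))  # m columns
    b_combos = list(product(*[range(k) for k in b_levels]))  # n rows
    return [[(a, b) for a in a_combos] for b in b_combos]


layout = factorial_array(a_levels=[2, 3], b_levels=[2, 2])
```

Here `layout` has n = 4 rows and m = 6 columns, and every entry in a given column shares the same A combination, so each column is one group of 1st associates.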

Let the measurement of the uth treatment combination (u = 1, 2, ..., v)
measured in the zth block be denoted by $y_{uz}$ and let the underlying mathematical
model be

(3.1)  $y_{uz} = M + t_u + b_z + \epsilon_{uz},$

where M is a constant common to all measurements, $t_u$ is the effect of the uth
treatment combination, $b_z$ is the constant associated with the zth block
(z = 1, 2, ..., b), and $\{\epsilon_{uz}\}$ is a sequence of uncorrelated random variables having a zero mean
and (unknown) variance $\sigma^2$. For making all tests of significance, we shall further
assume that the $\{\epsilon_{uz}\}$ follow a normal distribution.

Due to the factorial nature of the experiment, a treatment combination $t_u$ can
be written as

(3.2)  $t_u = \sum_{s=1}^{g} (a_s)_{i_s} + \sum_{q=1}^{h} (b_q)_{j_q} + \sum_{t=2}^{g}\sum_{s=1}^{t-1} (a_{st})_{i_s i_t} + \sum_{r=2}^{h}\sum_{q=1}^{r-1} (b_{qr})_{j_q j_r} + \sum_{q=1}^{h}\sum_{s=1}^{g} (a_s b_q)_{i_s j_q} + \cdots + (a_{12\cdots g}\, b_{12\cdots h})_{i_1 \cdots i_g j_1 \cdots j_h}.$

The $(a_s)_{i_s}$ are constants associated with the main effect of $A_s$ at level $i_s$; the
$(a_{st})_{i_s i_t}$ are constants associated with the two factor interaction between $A_s$ and
$A_t$ at levels $i_s$ and $i_t$, etc. Similar interpretations hold for the constants associated
with the main effects and interactions of the B factors, and also for the
constants associated with the interactions composed of both A and B factors.
It is well known that these parameters are not all linearly independent and 
satisfy the following relations: 


| $ (a), = 0, 


j= 
i + 
a 
Fide. 


tqg@l 


(3.3) > (ber) ier = 
jgmh 


Ma 
 % (Gi2..-9 Di---) ing + gJ12--oh 
| dom 


= x (Gup..g Dun---a) ing---e509°-+2 = 0, a 1, 2, — 


ign 


If the adjusted treatment total for the uth treatment is defined by

$Q_u$ = (uth treatment total) − (sum of the block averages in which the uth treatment occurs),

then the treatment estimates can conveniently be written as

(3.4)  $\hat{t}_u = \dfrac{1}{r(k-1)}\,[kQ_u + c_1 S_1(Q_u) + c_2 S_2(Q_u)].$

Here $S_1(Q_u)$ and $S_2(Q_u)$ are the sums of the adjusted treatment totals for the
1st and 2nd associates with respect to treatment u, and $c_1$, $c_2$ are constants
calculated from the design parameters. (All catalogues of group divisible designs
[2], [5], [7], [8] give numerical values of $c_1$ and $c_2$.)
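
Formula (3.4) can be sketched in Python as follows. The labelling (treatment u in group u // n), the function name, and the numerical values of Q, r, k, $c_1$, and $c_2$ are illustrative assumptions; in practice $c_1$ and $c_2$ would be read from a catalogue of designs. The adjusted totals sum to zero by construction, and the example respects that.

```python
import numpy as np


def treatment_estimates(Q, m, n, r, k, c1, c2):
    """Evaluate (3.4) for all treatments at once.

    Q holds the adjusted treatment totals; treatment u is assumed to
    belong to group u // n, so Q.reshape(m, n) has one group per row.
    """
    Q = np.asarray(Q, dtype=float)
    group_total = np.repeat(Q.reshape(m, n).sum(axis=1), n)
    S1 = group_total - Q          # totals of the 1st associates of u
    S2 = Q.sum() - group_total    # totals of the 2nd associates of u
    return (k * Q + c1 * S1 + c2 * S2) / (r * (k - 1))


# Hypothetical adjusted totals for m = 2 groups of n = 2 treatments.
t_hat = treatment_estimates([3.0, -1.0, -4.0, 2.0], m=2, n=2,
                            r=2, k=2, c1=0.5, c2=0.25)
```

Because the adjusted totals sum to zero, so do the resulting estimates, whatever values of $c_1$ and $c_2$ are supplied.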
Since these estimates satisfy the restraint $\sum_u \hat{t}_u = 0$, the variance of a
treatment estimate can be written as

(3.5)  $\operatorname{Var}(\hat{t}_u) = \dfrac{vk - [k + (n-1)c_1 + n(m-1)c_2]}{r(k-1)v}\,\sigma^2,$

and the covariance between treatments which are (say) sth associates is

(3.6)  $\operatorname{Cov}(\hat{t}_i, \hat{t}_j) = \dfrac{vc_s - [k + (n-1)c_1 + n(m-1)c_2]}{r(k-1)v}\,\sigma^2$

for s = 1, 2.
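
One quick consistency check on (3.5) and (3.6): since the estimates sum to zero, the variance of one estimate plus its covariances with its (n − 1) 1st associates and n(m − 1) 2nd associates must also sum to zero. The sketch below verifies this in exact rational arithmetic; the design values and the function name are illustrative assumptions.

```python
from fractions import Fraction as F


def var_cov_coefficients(m, n, r, k, c1, c2):
    """Coefficients of sigma^2 in (3.5) and (3.6).

    Returns (variance, covariance for 1st associates,
    covariance for 2nd associates).
    """
    v = m * n
    d = k + (n - 1) * c1 + n * (m - 1) * c2
    denom = r * (k - 1) * v
    return (v * k - d) / denom, (v * c1 - d) / denom, (v * c2 - d) / denom


# Hypothetical design: m = 3, n = 4, r = 3, k = 4, c1 = 1/2, c2 = 1/4.
var, cov1, cov2 = var_cov_coefficients(3, 4, F(3), F(4), F(1, 2), F(1, 4))
row_sum = var + (4 - 1) * cov1 + 4 * (3 - 1) * cov2
```

The row sum vanishes identically in $c_1$ and $c_2$, so the check does not depend on the hypothetical values chosen.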


Let (say) $A_1, A_2, \ldots, A_p$ ($p \le g$) and $B_1, B_2, \ldots, B_q$ ($q \le h$) be a selection
of the A and B factors and let them be associated with the particular
levels $i = (i_1, i_2, \ldots, i_p)$ and $j = (j_1, j_2, \ldots, j_q)$ respectively. We define an
S-function associated with these particular factors and levels by

(3.7)  $S[1, 2, \ldots, p;\; 1, 2, \ldots, q \mid i, j] = \dfrac{\prod_{s=1}^{p} m_s \prod_{r=1}^{q} n_r}{v} \sum_{i,j} \hat{t}_u,$

where the summation $\sum_{i,j}$ refers to the sum over all treatment estimates which
have the same levels $i_1, i_2, \ldots, i_p; j_1, j_2, \ldots, j_q$ with respect to the factors
$A_1, A_2, \ldots, A_p; B_1, B_2, \ldots, B_q$. If an S-function contains no A factors, we
shall denote this by $S[0; 1, 2, \ldots, q \mid j]$ with a similar notation for the absence
of B factors. (Note that these S-functions are simply the cell averages in any
(p + q) way table associated with these factors.) Then the expected value of
(3.7) is





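(3.8)  $E\{S[1, 2, \ldots, p;\; 1, 2, \ldots, q \mid i, j]\} = \sum_{s=1}^{p} (a_s)_{i_s} + \sum_{r=1}^{q} (b_r)_{j_r} + \sum_{t=2}^{p}\sum_{s=1}^{t-1} (a_{st})_{i_s i_t} + \sum_{t=2}^{q}\sum_{r=1}^{t-1} (b_{rt})_{j_r j_t} + \cdots + (a_{12\cdots p}\, b_{12\cdots q})_{i_1 \cdots i_p j_1 \cdots j_q},$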

where the summations refer to the particular factors $A_s$ ($s = 1, 2, \ldots, p$),
$B_r$ ($r = 1, 2, \ldots, q$) and the levels $i_s, \ldots, j_r, \ldots$ refer only to $i = (i_1, i_2, \ldots, i_p)$
and $j = (j_1, j_2, \ldots, j_q)$. There will be only (v − 1) linearly independent treatment
estimates and since the relations (3.3) imply that there exist (v − 1)
linearly independent factorial constants, the condition of unbiasedness is sufficient
to insure unique estimates of the factorial constants. Therefore the estimates
of the main effects and interaction parameters are given by

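(3.9)
$(\hat{a}_s)_{i_s} = S[s; 0 \mid i_s],$
$(\hat{b}_q)_{j_q} = S[0; q \mid j_q],$
$(\hat{a}_{st})_{i_s i_t} = S[s, t; 0 \mid i_s, i_t] - \{S[s; 0 \mid i_s] + S[t; 0 \mid i_t]\},$
$(\hat{b}_{qu})_{j_q j_u} = S[0; q, u \mid j_q, j_u] - \{S[0; q \mid j_q] + S[0; u \mid j_u]\},$
$(\widehat{a_s b_q})_{i_s j_q} = S[s; q \mid i_s, j_q] - \{S[s; 0 \mid i_s] + S[0; q \mid j_q]\},$
$\vdots$
$(\widehat{a_{12\cdots g}\, b_{12\cdots h}})_{i_1 \cdots i_g j_1 \cdots j_h} = S[1, 2, \ldots, g;\; 1, 2, \ldots, h \mid i_1, \ldots, i_g; j_1, \ldots, j_h] - \{S[1, 2, \ldots, g-1;\; 1, 2, \ldots, h \mid i_1, \ldots, i_{g-1}; j_1, \ldots, j_h] + \cdots\} + \cdots + (-1)^{g+h-1}\{S[1; 0 \mid i_1] + \cdots\}.$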


The estimate for a (p + q)th interaction involving the factors (say) $\{A_s\}$, $\{B_r\}$
associated with the respective levels $i_s$, $j_r$ ($s = 1, 2, \ldots, p$; $r = 1, 2, \ldots, q$)
can conveniently be written as

(3.10)  $(\hat{a}_{12\cdots p}\, \hat{b}_{12\cdots q})_{i_1 \cdots i_p j_1 \cdots j_q} = (-1)^{p+q} \sum_{w=1}^{p+q} (-1)^w \{w\},$

where $\{w\}$ denotes the sum of all S-functions involving exactly $w \le p + q$
factors from the above set.
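
The alternating sum (3.10) can be sketched in Python by storing the treatment estimates in an array with one axis per factor, so that an S-function is simply a cell average. The function names and the small additive example are illustrative assumptions; A and B factors are treated alike here because only cell averages are involved.

```python
from itertools import combinations

import numpy as np


def s_function(t_hat, factors, levels):
    """Cell average of t_hat over treatments with the given factor levels."""
    index = [slice(None)] * t_hat.ndim
    for f, level in zip(factors, levels):
        index[f] = level
    return float(np.mean(t_hat[tuple(index)]))


def interaction_estimate(t_hat, factors, levels):
    """Evaluate (3.10): (-1)^(p+q) * sum over w of (-1)^w * {w}."""
    n_f = len(factors)
    total = 0.0
    for w in range(1, n_f + 1):
        for subset in combinations(range(n_f), w):
            fs = [factors[i] for i in subset]
            ls = [levels[i] for i in subset]
            total += (-1) ** w * s_function(t_hat, fs, ls)
    return (-1) ** n_f * total


# Purely additive estimates (summing to zero): no two-factor interaction.
a = np.array([1.0, -1.0])
b = np.array([2.0, 0.0, -2.0])
t_hat = a[:, None] + b[None, :]
```

For this additive table, `interaction_estimate(t_hat, [0], [0])` recovers the main effect 1.0 and every two-factor interaction estimate is zero.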
3.2. Variances, covariances, and tests of significance. In this section we 
shall obtain the variances and covariances of the main effects and interaction 
terms. It will be shown that these can be written as direct products of matrices 
and this leads directly to the appropriate sums of squares for the analysis of 
variance. Four lemmas pertaining to the S-functions are derived and are used 
for proving three basic theorems pertaining to the analysis. 

Lemma 3.1. The variance of $S = S[1, 2, \ldots, p;\; 1, 2, \ldots, q \mid i, j]$ is

(3.11)  $\operatorname{Var} S = \dfrac{\sigma^2}{r(k-1)v}\,[(MN - 1)(k - c_1) + n(M - 1)(c_1 - c_2)],$

where $M = \prod_{s=1}^{p} m_s$, $N = \prod_{r=1}^{q} n_r$.

PROOF. The number of treatments summed in S is v/MN = mn/MN, which
can be regarded as m/M groups of n/N treatments each, such that treatments
within the same group are first associates and treatments in different groups
are second associates. Then there are $\binom{mn/MN}{2}$ different pairs of treatments
among the mn/MN treatments in S, of which

$\dfrac{m}{M}\binom{n/N}{2} = \dfrac{v(n - N)}{2N^2M}$

are 1st associates and

$\dfrac{1}{2}\left(\dfrac{m}{M} - 1\right)\left(\dfrac{n}{N}\right)\dfrac{v}{NM} = \dfrac{vn(m - M)}{2(MN)^2}$

are 2nd associates. Therefore the variance of S is

$\operatorname{Var} S = \dfrac{\sigma^2 M^2N^2}{v^2}\left\{ \dfrac{v}{MN}\cdot\dfrac{vk - [k + c_1(n-1) + c_2 n(m-1)]}{r(k-1)v} + \dfrac{2v(n - N)}{2N^2M}\cdot\dfrac{c_1 v - [k + c_1(n-1) + c_2 n(m-1)]}{r(k-1)v} + \dfrac{2vn(m - M)}{2(MN)^2}\cdot\dfrac{c_2 v - [k + c_1(n-1) + c_2 n(m-1)]}{r(k-1)v} \right\},$

which on simplifying gives the desired result.
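
The simplification can be checked by brute force: build the covariance matrix of the treatment estimates from (3.5) and (3.6), form the cell average S as a weight vector, and compare its variance with (3.11). In the sketch below $\sigma^2 = 1$, and the design values, the labelling of treatments, and the function name are illustrative assumptions.

```python
import numpy as np


def treatment_cov(m, n, r, k, c1, c2):
    """Covariance matrix of the estimates from (3.5)-(3.6), sigma^2 = 1.

    Assumed labelling: treatment u belongs to group u // n.
    """
    v = m * n
    d = k + (n - 1) * c1 + n * (m - 1) * c2
    scale = 1.0 / (r * (k - 1) * v)
    group = np.arange(v) // n
    same_group = group[:, None] == group[None, :]
    cov = np.where(same_group, v * c1 - d, v * c2 - d) * scale
    np.fill_diagonal(cov, (v * k - d) * scale)
    return cov


# A factors with 2 x 3 levels (m = 6), B factors with 2 x 2 levels (n = 4).
m, n, r, k, c1, c2 = 6, 4, 3, 4, 0.5, 0.25
cov = treatment_cov(m, n, r, k, c1, c2)

# S fixes A_1 (M = 2) and B_1 (N = 2), each at its first level.
M, N = 2, 2
a1 = (np.arange(m * n) // n) // 3     # level of A_1 for each treatment
b1 = (np.arange(m * n) % n) // 2      # level of B_1 for each treatment
weights = ((a1 == 0) & (b1 == 0)).astype(float)
weights /= weights.sum()              # S is the cell average

brute_force = float(weights @ cov @ weights)
formula = ((M * N - 1) * (k - c1) + n * (M - 1) * (c1 - c2)) / (r * (k - 1) * m * n)
```

The two values agree, and since the identity is algebraic the agreement does not depend on the hypothetical $c_1$ and $c_2$ used.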


Lemma 3.2. Let $S = S[1, 2, \ldots, p;\; 1, 2, \ldots, q \mid i, j]$ and

$S' = S[1', 2', \ldots, p';\; 1', 2', \ldots, q' \mid i', j']$

be two S-functions having a A factors and b B factors in common, such that for $a_1$
and $b_1$ of these factors, the levels are identical and for $a_2$ and $b_2$ of these (common)
factors, the levels are different ($a = a_1 + a_2$, $b = b_1 + b_2$). Then

(3.13)  $\operatorname{Cov}(S, S') = \dfrac{\sigma^2}{r(k-1)v}\,[(M_1N_1 - 1)(k - c_1) + n(M_1 - 1)(c_1 - c_2)],$

where $M_1 = \prod_{i=1}^{a_1} m_i$ (product of the levels of the $a_1$ factors having common levels)
and $N_1 = \prod_{j=1}^{b_1} n_j$ (product of the levels of the $b_1$ factors having common levels).

PROOF. The number of treatments summed in S and S' are v/MN
and v/M'N' respectively. These treatments can be regarded as consisting of
two rectangular treatment arrays of dimensions (n/N) × (m/M) and
(n/N') × (m/M') respectively. The two arrays will overlap if they have common
treatments and the number of such common treatments is

$\dfrac{vM_1N_1}{(MN)(M'N')} = \left(\dfrac{mM_1}{MM'}\right)\left(\dfrac{nN_1}{NN'}\right).$
It is convenient to depict the intersection of the rectangular arrays by the five
regions as shown below,

[Figure: two overlapping rectangular arrays divided into regions (1) to (5)]

where region (1) is an array representing the common treatments having
$(nN_1/NN')$ rows and $(mM_1/MM')$ columns. If $\sum(i)$ represents the sum of the
treatments in the ith region (i = 1, 2, 3, 4, 5), then

$S = \sum(1) + \sum(4) + \sum(5),$
$S' = \sum(1) + \sum(2) + \sum(3).$

Hence, in order to find the covariance between S and S', it is necessary to find
the number of pairs of 1st and 2nd associates formed from the multiplication
of S and S'. These will give pairs formed from $[\sum(1)]^2$, $\sum(1)\sum(2)$,
$\sum(1)\sum(3)$, $\sum(1)\sum(4)$, $\sum(2)\sum(4)$, $\sum(3)\sum(4)$, $\sum(1)\sum(5)$,
$\sum(2)\sum(5)$, and $\sum(3)\sum(5)$.
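Define

(3.14)
$n_1 = \dfrac{nN_1}{NN'}, \qquad m_1 = \dfrac{mM_1}{MM'},$
$n_2 = \dfrac{n}{NN'}(N - N_1), \qquad m_2 = \dfrac{m}{MM'}(M - M_1),$
$n_3 = \dfrac{n}{NN'}(N' - N_1), \qquad m_3 = \dfrac{m}{MM'}(M' - M_1).$

Then the dimensions of the five regions are:

region (1): $n_1 \times m_1$,
region (2): $n_2 \times m_1$,
region (3): $(n_1 + n_2) \times m_2$,
region (4): $n_3 \times m_1$,
region (5): $(n_1 + n_3) \times m_3$.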

Since the treatments in the same row are 1st associates of each other and treatments
in different rows 2nd associates, it is an easy matter to count the number
of 1st and 2nd associates arising from pairs formed from $\sum(i)\sum(j)$. Performing
the necessary algebra, we find that the total number of 1st associate pairs
is

$\dfrac{vM_1(n - N_1)}{(NN')(MM')}$

and the total number of 2nd associate pairs is

$\dfrac{vn(m - M_1)}{(NN')(MM')}.$

Therefore,

$\operatorname{Cov}(S, S') = \dfrac{(MM')(NN')}{v^2}\left\{ \dfrac{vM_1N_1}{(MM')(NN')}\,[\operatorname{Var}\hat{t}\,] + \dfrac{vM_1(n - N_1)}{(NN')(MM')}\operatorname{Cov}(\text{1st associates}) + \dfrac{vn(m - M_1)}{(NN')(MM')}\operatorname{Cov}(\text{2nd associates}) \right\}.$

On simplifying we get the desired result.
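
The same brute-force check used for a single S-function can be applied to the covariance (3.13). The sketch below takes two overlapping cells whose only common factor, $A_1$, is at the same level in both (so $a_2 = b_2 = 0$, $M_1 = 2$, $N_1 = 1$), with $\sigma^2 = 1$. The design values, the labelling of treatments, and the function name are illustrative assumptions.

```python
import numpy as np


def treatment_cov(m, n, r, k, c1, c2):
    """Covariance matrix of the estimates from (3.5)-(3.6), sigma^2 = 1.

    Assumed labelling: treatment u belongs to group u // n.
    """
    v = m * n
    d = k + (n - 1) * c1 + n * (m - 1) * c2
    scale = 1.0 / (r * (k - 1) * v)
    group = np.arange(v) // n
    same_group = group[:, None] == group[None, :]
    cov = np.where(same_group, v * c1 - d, v * c2 - d) * scale
    np.fill_diagonal(cov, (v * k - d) * scale)
    return cov


# A factors with 2 x 3 levels (m = 6), B factors with 2 x 2 levels (n = 4).
m, n, r, k, c1, c2 = 6, 4, 3, 4, 0.5, 0.25
cov = treatment_cov(m, n, r, k, c1, c2)

u = np.arange(m * n)
a1, a2 = (u // n) // 3, (u // n) % 3   # levels of A_1 and A_2
b1 = (u % n) // 2                      # level of B_1

# S fixes A_1 = 0 and B_1 = 0; S' fixes A_1 = 0 and A_2 = 1.
w_s = ((a1 == 0) & (b1 == 0)).astype(float)
w_s /= w_s.sum()
w_sp = ((a1 == 0) & (a2 == 1)).astype(float)
w_sp /= w_sp.sum()

M1, N1 = 2, 1                          # common factor A_1, no common B factor
brute_force = float(w_s @ cov @ w_sp)
formula = ((M1 * N1 - 1) * (k - c1) + n * (M1 - 1) * (c1 - c2)) / (r * (k - 1) * m * n)
```

Both computations give 4.5/216 for these assumed values.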
Lemma 3.3. Let $(ab) = (\hat{a}_{12\cdots p}\, \hat{b}_{12\cdots q})_{i_1 i_2 \cdots i_p j_1 j_2 \cdots j_q}$ be the estimate of the (p + q)th
factor interaction associated with the factors $\{A_s\}$ ($s = 1, 2, \ldots, p$) and
$\{B_r\}$ ($r = 1, 2, \ldots, q$).
Let $S' = S[1', 2', \ldots, p';\; 1', 2', \ldots, q' \mid i', j']$ be an S-function which is not
associated with all factors (regardless of level) of (ab). Then

(3.17)  $\operatorname{Cov}[(ab), S'] = 0.$


PROOF. Let $\alpha$ be the number of common A factors between (ab) and S', and
$\alpha_1$ and $\alpha_2$ ($\alpha = \alpha_1 + \alpha_2$) be the number of these common factors having the
same levels and different levels, respectively. Define $\beta$, $\beta_1$, and $\beta_2$ in the same
manner with respect to the B factors. Since the interaction (ab) can be written
in the form

$(ab) = (-1)^{p+q} \sum_{w} (-1)^w \{w\},$

consider a fixed $\{w\}$ and a particular S-function in $\{w\}$ having the characteristics
$a, a_1, a_2, b, b_1, b_2$ as defined in Lemma 3.2.
Define 
(C(O, 0) = —1, 
C(a) = >> --- >> (mm, + *- m., — 1), asm, 
(3.18) ! C(b:) = >> +--+ DS (nny +++ Me, — 21), bh SB, 
| C(a , bs) = > --- > (mm, ++: Mg Nr,Nry *** Ney — 1), 


& 
aq >a, bh Sh, 


where the summations are only over combinations of A and B factors taken 
a; and b; at a time respectively, such that these factors are those in which (ab) 
and S have in common at the same level. 

Then the covariance between S’ and {w} can be written 


( . 
} t Fe P += eo 8 
| Cov [S . {w}] ph ss >» 
r(k cane 1)v [ate w— a, —_ b; 
| : (k — ¢e)C(a,,b;) +n (cy — c)C(a, 
| b, 
| 
| p+q-a-8 
(3.19) 4 —- > 
| a,;+b=w w— a, — bh 
be #0 


[ a\ (8 B 
; (k—c) —n (cy — ¢2)C(a,) 
b 


L ay, b 


¢? P+q—a—B\/ae 


eg=a W— de a2 


j 
[(k — 1) + n(qy — &)]>. 


Note that the first summation is for those S-functions in $\{w\}$ for which
$a_2 = b_2 = 0$; the second summation refers to $a_2 = 0$, $b_2 \neq 0$; and the third
summation is when $a_2 \neq 0$. Since the covariance between S' and (ab) is

(3.20) $\operatorname{Cov}[S', (ab)] = (-1)^{p+q} \sum_{w} (-1)^{w} \operatorname{Cov}[S', \{w\}],$

we can substitute (3.19) in (3.20) to obtain an explicit expression for (3.20).
Now with respect to fixed values of a; , a2 , b; , and b. the only terms contribut- 


ing to the first summation in (3.19) is when 
w=at+h,---,ptqtath—a-—B; 


the value of w contributing to the second summation in (3.19) is for 


w=a+b,---,ptqtatb-—a-B 
and the contributing value of w for the last summation in (3.19) is when


W= s,°°° DtQqta-a-— 6B. 


Therefore collecting coefficients of 


| ~, eee (*) — adctas | 
1 
in (3.20) gives 


p+q—a-—8 a coal 
(3.21) (<1) > * ‘ers °) (—1)” = 0 


w=0 Ww 


for all a; and b,; . Collecting coefficients of 


@) Ie) &— a) - aa — eC(as | 


p+q—a-—8 eoi> do, at 
(—y* y (? + q c *) (—3)* oa 0 


w=0 Ww 


results in 


for all a, , b} , and &. Finally, with respect to the coefficient of 


(@)ia — (1) + n(cr — c)] 


in (3.20) we have 


p+q—a—8 a: ce wa 
(—1)” p ® e ' q a °) (—1)” on 0. 


Ww 


u=0 
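Each of the three collected coefficients vanishes for the same reason: after shifting the index, the sum over w is the alternating binomial sum $\sum_{w=0}^{N} \binom{N}{w} (-1)^w$ with $N = p + q - \alpha - \beta$, and $N \ge 1$ because S' is not associated with all factors of (ab). The following is only a small numerical sanity check of that identity; the function name is ours and does not appear in the paper.

```python
from math import comb


def alternating_binomial_sum(n: int) -> int:
    """Return sum_{w=0}^{n} C(n, w) * (-1)**w."""
    return sum(comb(n, w) * (-1) ** w for w in range(n + 1))


# The sum vanishes for every n >= 1 (it is the expansion of (1 - 1)**n).
for n in range(1, 10):
    assert alternating_binomial_sum(n) == 0

# For n = 0 the sum is the single term C(0, 0) = 1, so it does not vanish.
assert alternating_binomial_sum(0) == 1
```

The case n = 0 is exactly the situation excluded by the hypothesis of Lemma 3.3, and it is the case treated separately in Lemma 3.4.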


Lemma 3.4. Let $(ab) = (a_{12\cdots p}\, b_{12\cdots q})_{i_1 i_2 \cdots i_p j_1 j_2 \cdots j_q}$ be an estimate of the (p + q)
factor interaction associated with the factors $\{A_i\}$ $(i = 1, 2, \cdots, p)$ and
$\{B_j\}$ $(j = 1, 2, \cdots, q)$.
Let $S' = S[1, 2, \cdots, p; 1, 2, \cdots, q \mid i', j']$ be an S-function associated with the
same factors as (ab) such that $\alpha_1$ and $\beta_1$ of the A and B factors have common levels.
Then,

(3.22) $\operatorname{Cov}[(ab), S'] = \begin{cases} (-1)^{p+q+\alpha_1+\beta_1}\, \phi(1, 2, \cdots, \alpha_1)\, \phi(1, 2, \cdots, \beta_1)\, \dfrac{\sigma^2}{E_2 r v}, & \text{if } q \neq 0, \\[2ex] (-1)^{p+\alpha_1}\, \phi(1, 2, \cdots, \alpha_1)\, \dfrac{\sigma^2}{E_1 r v}, & \text{if } q = 0, \end{cases}$

where

(3.23)
$\phi(1, 2, \cdots, \alpha_1) = (m_1 - 1)(m_2 - 1) \cdots (m_{\alpha_1} - 1),$
$\phi(1, 2, \cdots, \beta_1) = (n_1 - 1)(n_2 - 1) \cdots (n_{\beta_1} - 1),$

and

(3.24) $E_1 = \dfrac{k - 1}{(k - c_1) + n(c_1 - c_2)}, \qquad E_2 = \dfrac{k - 1}{k - c_1}.$


Proof. If we expand (ab) in terms of S-functions (Eq. 3.10), we can write
the covariance between S' and a fixed $\{w\}$ for $q \neq 0$ as


2 ( 
Cov [S’, {w}] = ——— = 4 —c) >, Clm,d) 
r(k — 1) | 


a,+b,;—w 


+ nlc; — ¢e) 7 (*) (a) | 
a,+b,;=w by 
-| = S s ((k — co) + nlc - ol] 
(3.25 ) a2 ‘ey 


as #0 


) 
~ Leni en )(r) = 2 
(s)(m) com 


where the first bracket is when $a_2 = b_2 = 0$; the second bracket is the case
$a_2 \neq 0$; and the third bracket refers to $a_2 = 0$, $b_2 \neq 0$. Substituting (3.25) in


— nlc, — C2) 
ata 


(3.26) $\operatorname{Cov}[S', (ab)] = (-1)^{p+q} \sum_{w=1}^{p+q} (-1)^{w} \operatorname{Cov}[S', \{w\}]$

results in the first bracket being written as (neglecting the constant term)


ait+8; 
(— pyr teres | 7. (— 1 )" 2. C(a, b)(k — C1) 


w=l1 @,+b\;—w 


ait; 
w By ' 
ee +n(cq, —e (—1) Cla 
(zi =o) Zw EP) ceo] 
= (—1)?teter+F19(1 | 2, et , a)o(1, 2, Pe » Bid(k = 7 


ai+@q 
+ (-—1)"* ‘ner = a) YY Cla) Xe (-1 4 tae 
Pon 1 


w=a\ 


With respect to the bracket when $a_2 \neq 0$, we can write these terms after
substituting in (3.26) as


(328) fk — a + ala — ol |S (-1) " . 1) pare y(*! + ‘)] =0. 
Finally for the terms where $a_2 = 0$, $b_2 \neq 0$, after substituting in (3.26), we can
write
( ai+¢ ait8y 
| (1)? | >, (~1)" (« _ 0) - - (—1)" (* t *) (k — a) 
w Ww 


| w=) w=) 
(3.29) § 


| 4 (—1)"**n(e, _ C2) | Sea) za -o[(@: vd . ‘o (,, . ») | 


The first term in (3.29) is identically zero and combining the second term of 
the right hand side of (3.27) with the second term of (3.29) gives 


ai ai+q 
(—1)"**n(e, — ¢) iz C(a) > (1 ( q )| = 0. 
a,=1 r=a, ae ay 


Thus the Lemma is true for $q \neq 0$. For $q = 0$, the covariance between S' and
$\{w\}$ will be


Cov [S’, }w}] = (k — e,) + ney — c2))C(a,) 


) 
a \f ae : a 

(h . @.) ae 
(RG) (0) +e — oo} 


and following the same reasoning as for $q \neq 0$, we can prove the Lemma for
$q = 0$.


THEOREM 3.1. Let $(ab) = (a_{12\cdots p}\, b_{12\cdots q})_{i_1 i_2 \cdots i_p j_1 j_2 \cdots j_q}$ be an estimate of the
(p + q)th factor interaction associated with the factors $\{A_i\}$ $(i = 1, 2, \cdots, p)$ and
$\{B_j\}$ $(j = 1, 2, \cdots, q)$;
let $(ab)' = (a_{12\cdots r}\, b_{12\cdots s})_{i'_1 i'_2 \cdots i'_r j'_1 j'_2 \cdots j'_s}$ be an (r + s) factor interaction associated
with the factors $\{A_i\}$ $(i = 1, 2, \cdots, r)$ and $\{B_j\}$ $(j = 1, 2, \cdots, s)$, such that all
factors are not identical between (ab) and (ab)'. Then the two different interactions
are uncorrelated.

Proof. (ab) can be expanded in terms of S-functions, such that no S-function
contains all the factors of (ab)'. Hence by Lemma 3.3, the covariances between
all such S-functions and (ab)' are zero, which proves the theorem.

THEOREM 3.2. The variance of the (p + q) factor interaction
$(ab) = (a_{12\cdots p}\, b_{12\cdots q})_{i_1 i_2 \cdots i_p j_1 j_2 \cdots j_q}$
associated with the factors $\{A_i\}$ $(i = 1, 2, \cdots, p)$ and $\{B_j\}$ $(j = 1, 2, \cdots, q)$ is

(3.31) $\operatorname{Var}(ab) = \begin{cases} \phi(1, 2, \cdots, p)\, \dfrac{\sigma^2}{E_1 r v}, & \text{if } q = 0, \\[2ex] \phi(1, 2, \cdots, p)\, \phi(1, 2, \cdots, q)\, \dfrac{\sigma^2}{E_2 r v}, & \text{if } q \neq 0. \end{cases}$

Proof. Using Lemma 3.3, we can show that
$\operatorname{Var}(ab) = \operatorname{Cov}[(ab), S],$
where S denotes that S-function coinciding in all factors and levels with the
interaction (ab). Hence, by Lemma 3.4 the theorem is proved.
THEOREM 3.3. Let $(ab)_{ij}$ and $(ab)_{i'j'}$ be two estimates of a (p + q) factor
interaction associated with the factors $\{A_i\}$ $(i = 1, 2, \cdots, p)$ and
$\{B_j\}$ $(j = 1, 2, \cdots, q)$
such that for $\alpha_1$ of the A factors and $\beta_1$ of the B factors, the levels are identical. Then

(3.32) $\operatorname{Cov}[(ab)_{ij}, (ab)_{i'j'}] = \begin{cases} (-1)^{p+\alpha_1}\, \phi(1, 2, \cdots, \alpha_1)\, \dfrac{\sigma^2}{E_1 r v}, & \text{if } q = 0, \\[2ex] (-1)^{p+q+\alpha_1+\beta_1}\, \phi(1, 2, \cdots, \alpha_1)\, \phi(1, 2, \cdots, \beta_1)\, \dfrac{\sigma^2}{E_2 r v}, & \text{if } q \neq 0. \end{cases}$

Proof. Expanding $(ab)_{ij}$ in terms of S-functions, taking the covariance of
$(ab)_{i'j'}$ with each of the S-functions associated with $(ab)_{ij}$, and using Lemma
3.3, results in
$\operatorname{Cov}[(ab)_{ij}, (ab)_{i'j'}] = \operatorname{Cov}[(ab)_{i'j'}, S'],$
where S' is that S-function associated with the factors $\{A_i\}$ and
$\{B_j\}$ $(i = 1, 2, \cdots, p;\ j = 1, 2, \cdots, q)$
and levels $i_{12\cdots p}\, j_{12\cdots q}$.
Hence, by Lemma 3.4 the theorem is proved.
Theorems 3.1 through 3.3 give the variances and covariances of any general
interaction. Define the square matrices $M(i)$ and $N(j)$ of dimension $m_i$ and
$n_j$, respectively, by

(3.33)
$M(i) = m_i I - J, \quad i = 1, 2, \cdots, g,$
$N(j) = n_j I - J, \quad j = 1, 2, \cdots, h,$

where J is a matrix of appropriate dimension having all elements unity. Then
the variance-covariance matrix of the estimates of the (p + q) factor interaction
$(a_{12\cdots p}\, b_{12\cdots q})_{i_1 i_2 \cdots i_p j_1 j_2 \cdots j_q}$ ranging over all the $\prod_{i=1}^{p} m_i \prod_{j=1}^{q} n_j$ combinations
is given by the direct matrix product

(3.34) $\dfrac{\sigma^2}{E_1 r v} [M(1) * M(2) * \cdots * M(p)], \quad \text{if } q = 0,$

or

(3.35) $\dfrac{\sigma^2}{E_2 r v} [M(1) * \cdots * M(p) * N(1) * \cdots * N(q)], \quad \text{if } q \neq 0.$
Therefore, since $M(i)^2 = m_i M(i)$, $N(j)^2 = n_j N(j)$, and using Lemmas (2.2)
and (2.5) the sums of squares

$\dfrac{E_1 r v}{\prod_{s=1}^{p} m_s} \sum (a_{12\cdots p})^2_{i_1 i_2 \cdots i_p}, \quad \text{if } q = 0,$

$\dfrac{E_2 r v}{\prod_{s=1}^{p} m_s \prod_{r=1}^{q} n_r} \sum (a_{12\cdots p}\, b_{12\cdots q})^2_{i_1 i_2 \cdots i_p j_1 j_2 \cdots j_q}, \quad \text{if } q \neq 0,$

follow $\chi^2 \sigma^2$ distributions with $\prod_{s=1}^{p} (m_s - 1)$ and $\prod_{s=1}^{p} \prod_{r=1}^{q} (m_s - 1)(n_r - 1)$
degrees of freedom respectively if the null hypothesis of no interaction effect is
true.
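The degrees of freedom quoted above can be read off from the matrices in (3.33): because $M(i)^2 = m_i M(i)$, the direct product divided by the product of the $m_i$ is idempotent, and the trace of an idempotent matrix equals its rank. The following is a minimal sketch of that check in plain Python for one A factor with 3 levels and one B factor with 2 levels; the helper names are ours and the level counts are arbitrary illustrative choices.

```python
def level_matrix(m):
    """Return M = m*I - J as a list of rows (m - 1 on the diagonal, -1 elsewhere)."""
    return [[(m - 1) if i == j else -1 for j in range(m)] for i in range(m)]


def kron(A, B):
    """Direct (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


m, n = 3, 2
K = kron(level_matrix(m), level_matrix(n))

# K*K = (m*n)*K, so K/(m*n) is idempotent.
assert matmul(K, K) == [[m * n * x for x in row] for row in K]

# trace(K/(m*n)) = rank = (m - 1)(n - 1), the interaction degrees of freedom.
assert trace(K) // (m * n) == (m - 1) * (n - 1)
```

The same check extends to any number of factors by chaining `kron`, which mirrors the direct products in (3.34) and (3.35).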


Using Lemma 2.1 these sums of squares have the expected values 


Diss) seco + LD (ms = V0? 


E,r 


p By - ss —— a Il —e IT ae ita 
A yan te ** 12--.4 i=l 7 


s=l r=] 


The entire intra-block analysis of variance can be summarized in Table 1
where B represents the b × 1 vector of block totals, Q is the v × 1 vector of
adjusted treatment totals, t is the v × 1 vector of treatment estimates,


zs 
(grand total)* 
bk , 


and terms such as 


are written as (dje...p)", ete. 

The computations for the analysis of variance are straightforward and actually
amount to treating the $t_i$'s as observations on a one replicate factorial
experiment, where all sums of squares are multiplied by $E_1 r$ or $E_2 r$. It is also
clear from the analysis of variance that the various interactions are estimated
with one of a possible two types of efficiencies. If the interaction is composed
only of A factors the efficiency is $E_1$, otherwise the efficiency will be $E_2$.

Extension to the balanced incomplete block designs. The balanced incomplete 
block designs can also be used for asymmetric factorial arrangements by as- 
signing the v factorial combinations to the v treatments of the balanced incom- 
plete block design. All results immediately follow by regarding the balanced 
incomplete block designs as a “degenerate” partially balanced design. Then 
TABLE 1


Summary of intra-block analysis of variance 


Source Degrees of freedom Sum of squares 


Blocks (unadjusted) 


Treatments (adjusted) 


| A, j S/0ey — 1 


| As | (mz — 1 
| | 
} 


| 


B, 


abi r 


oon Ey Pr Ay 


Tl Me Il n, 


Error Yi? 


Total 


$c_1 = c_2 = k/v$ in (3.4), $E_1 = E_2 = E = v(k - 1)/k(v - 1)$, and all main effects
and interactions are estimated with an efficiency factor E.
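As a small worked illustration of the efficiency factor $E = v(k - 1)/k(v - 1)$, the sketch below evaluates it exactly and compares it with the equivalent standard form $\lambda v / (rk)$, which follows from the balanced incomplete block identity $\lambda(v - 1) = r(k - 1)$. The function names and the example design (v = 7, k = 3, r = 3, λ = 1) are our own illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def bibd_efficiency(v: int, k: int) -> Fraction:
    """Efficiency factor E = v(k - 1) / (k(v - 1)) of a balanced incomplete block design."""
    return Fraction(v * (k - 1), k * (v - 1))


def bibd_efficiency_from_lambda(v: int, k: int, r: int, lam: int) -> Fraction:
    """Equivalent form E = lambda * v / (r * k), valid when lambda(v - 1) = r(k - 1)."""
    if lam * (v - 1) != r * (k - 1):
        raise ValueError("parameters do not satisfy lambda(v - 1) = r(k - 1)")
    return Fraction(lam * v, r * k)


# Example design with v = 7 treatments in blocks of size k = 3 (r = 3, lambda = 1).
E = bibd_efficiency(7, 3)
assert E == Fraction(7, 9)
assert E == bibd_efficiency_from_lambda(7, 3, 3, 1)
```

Exact rational arithmetic is used so that the two forms can be compared with `==` rather than with a floating-point tolerance.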


4. The recovery of inter-block information. If the block effects $b_j$ in (3.4)
can be regarded as a sequence of random variables such that

$E(b_j) = 0, \quad \operatorname{Var} b_j = \sigma_b^2,$
$\operatorname{Cov}(b_j, b_t) = 0, \quad \text{for } j \neq t,$
$\operatorname{Cov}(\epsilon_{ij}, b_j) = 0,$


it will be possible to extract additional information from the block totals. This 
additional analysis is sometimes termed the recovery of inter-block informa- 
tion or the interblock analysis. With respect to the balanced incomplete block 
designs, the inter-block analysis was first developed by Yates [20] and appears
in the books by Cochran and Cox [9], Federer [10], and Kempthorne [13]. The 
inter-block analysis with respect to the partially balanced designs is discussed 
in a particularly simple form by Bose and Shimamoto [5] and by Bose, Clat- 
worthy, and Shrikhande [2]. Generally it will be possible to use this inter-block 
information in two ways. The preceding references discuss how one may com- 
bine the inter-block information with the intra-block information in order to 
obtain the most precise treatment estimates. The paper by Zelen [21] discusses
how one can use this inter-block information for obtaining additional inde- 
pendent tests of significance for every hypothesis pertaining to the treatments. 

Define $Q_i^* = T_i - Q_i - r\bar{y}$, where $Q_i$ is the ith adjusted treatment total, $T_i$
is the total for the ith treatment and $\bar{y}$ is the grand average of all measurements.
Then the best estimates of the treatments using both the intra- and inter-block
information can be written as

(4.2) $t_i = \dfrac{1}{R(k - 1)} \{ k P_i + d_1 S_1(P_i) + d_2 S_2(P_i) \},$

where

$P_i = W Q_i + W^* Q_i^*,$

$R = r \left[ W + \dfrac{W^*}{k - 1} \right],$

$W = \dfrac{1}{\sigma^2}, \qquad W^* = \dfrac{1}{\sigma^2 + k \sigma_b^2}.$

The constants $d_1$, $d_2$ are usually tabulated with all the designs. Note that (4.2)
is the same form as (3.4) except that $P_i$ replaces $Q_i$, R replaces r, and $d_1$, $d_2$
replace $c_1$, $c_2$. Thus all results in Section 3 carry over directly by substituting
the above changes in the formulas of Section 3 and replacing $\sigma^2$ by unity.
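To make the combination step concrete, the sketch below forms the weighted totals $P_i = W Q_i + W^* Q_i^*$. It assumes that the weights are the usual reciprocals of the intra-block and inter-block variances, $W = 1/\sigma^2$ and $W^* = 1/(\sigma^2 + k\sigma_b^2)$, and that the variance components are known; in practice they would be replaced by estimates. The function names and the numbers are ours, chosen only for illustration.

```python
def interblock_weights(sigma2: float, sigma_b2: float, k: int):
    """Return (W, W_star), assuming W = 1/sigma^2 and W* = 1/(sigma^2 + k*sigma_b^2)."""
    if sigma2 <= 0 or sigma_b2 < 0:
        raise ValueError("variance components must satisfy sigma2 > 0, sigma_b2 >= 0")
    return 1.0 / sigma2, 1.0 / (sigma2 + k * sigma_b2)


def combined_totals(Q, Q_star, W, W_star):
    """Return the list of P_i = W*Q_i + W_star*Q_star_i."""
    return [W * q + W_star * qs for q, qs in zip(Q, Q_star)]


# Illustrative numbers only: sigma^2 = 4, sigma_b^2 = 2, block size k = 3.
W, W_star = interblock_weights(4.0, 2.0, 3)
P = combined_totals([2.0, -2.0], [5.0, -5.0], W, W_star)
assert abs(P[0] - 1.0) < 1e-12 and abs(P[1] + 1.0) < 1e-12
```

When the block variance $\sigma_b^2$ is large, $W^*$ becomes small and $P_i$ is dominated by the intra-block total $Q_i$, which matches the intuition that block totals then carry little usable information.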

On the other hand, under certain conditions which are elaborated in [4], 
one can also obtain additional independent tests of significance using only the 
inter-block information. Three cases have to be considered depending on 
whether the group divisible design is a regular, singular, or semi-regular design. 
These are the three exhaustive classes of group divisible designs introduced 
by Bose and Connor [3]. 

For the regular group divisible designs the inter-block treatment estimates 
can be written as 
fos = QE + cf S(QH + cf QI) 


: 
and will have a variance of 
} * *; 
k—(k+ (n — — lez] |, 2 2 
Vor fl = = (h + (n Ler + n(m L si (a? + ko). 
rv 
Also if (7 and é} are sth associates (s = 1, 2), 


f 


ia = * ( — 1\r* 
Cov (tT, tf) = pec Reins ie sain tel 


| (o* + kaj). 


The quantities cf and cz are defined by 


= c,A — rr, 
F A-rH+r 
where the parameters $c_s$, $\Delta$, $\lambda_s$, and H are the usual parameters for partially
balanced designs (cf. [2], [5]). Therefore all results for Section 3 apply directly
by replacing $\sigma^2$ by $(\sigma^2 + k\sigma_b^2)$, $c_s$ by $c_s^*$, and $r(k - 1)$ by r. This results in the
two efficiencies being

$E_1^* = \dfrac{1}{k - c_1^* + n(c_1^* - c_2^*)}, \qquad E_2^* = \dfrac{1}{k - c_1^*},$

and the breakdown of the v − 1 treatment sum of squares, using only the
inter-block information, is similar to Table 1. If b > v, there will be an independent
estimate of $\sigma^2 + k\sigma_b^2$, thus permitting independent inter-block tests of
significance for the main effects and interactions.

With respect to the singular designs, the intra-block efficiencies are

$E_1 < 1, \quad E_2 = 1.$

Hence it is only possible to obtain inter-block estimates for those main effects
and interactions associated only with the A factors. Since treatments in the
same group are first associates, $(1/n)[t_i + S_1(t_i)]$ represents the average of the
group to which treatment i belongs. This average is estimated by
ty* , 
(4.5) weeitias., 2-22-—. 
n Ewr k(m — 1) 

There will be m such estimates, thus making it possible to have S-functions of
the form $S[1, 2, \cdots, p; 0 \mid i]$ for $p \le g$. Then all results of Section 3 follow by
replacing $E_1$ by $E_1^*$ and $\sigma^2$ by $\sigma^2 + k\sigma_b^2$. If b > m, this will permit an estimate
of $\sigma^2 + k\sigma_b^2$ and thus we can have independent inter-block tests of significance
for the A factors.

The semi-regular group divisible designs have the intra-block efficiencies
$E_1 = 1$, $E_2 < 1$. Therefore it is possible to obtain inter-block estimates of those
main effects and interactions having the intra-block efficiency $E_2$. These
(v − m) contrasts can be found by using the normal equations
. Ais x * ° 9 
(4.6) + ci = Qi, fe OP coe cw 
=] y 
where $\lambda_{ii} = r$, and $\lambda_{is}$ = number of blocks in which treatments i and s appear
together. The rank of Eqs. (4.6) is exactly (v − m). If b > (v − m), then it
will be possible to have an independent estimate of $(\sigma^2 + k\sigma_b^2)$, thus allowing
independent inter-block tests of significance for testing these contrasts. An
open problem is to simplify this analysis.

Extension to balanced incomplete block designs. Similar results apply to the 
recovery of inter-block information for the balanced incomplete block designs. 
The best combined estimate can be written as 


ee gp a- R&-1)+W — W*) 
‘“ (WE+W*0 — £)| ER’ ro R- 
Therefore all results of Section 3 can also be carried over by substituting unity
for $\sigma^2$, R for r, etc. This produces an efficiency of


R ” 


In addition if one wished to obtain additional independent significance tests
using the inter-block information only, the treatment estimates can be written

$t_i^* = \dfrac{Q_i^*}{(1 - E) r}$

and all results of Section 3 follow by substituting $\sigma^2 + k\sigma_b^2$ for $\sigma^2$, and

$E_1 = E_2 = 1 - E.$


Again we will have two independent tests of significance for testing every null 
hypothesis pertaining to the main effects and interactions. 
ACCELERATED STOCHASTIC APPROXIMATION¹
By HARRY KESTEN


Cornell University


1. Summary. Using a stochastic approximation procedure $\{X_n\}$, $n = 1, 2, \cdots$,
for a value $\theta$, it seems likely that frequent fluctuations in the sign of
$(X_n - \theta) - (X_{n-1} - \theta) = X_n - X_{n-1}$ indicate that $|X_n - \theta|$ is small, whereas
few fluctuations in the sign of $X_n - X_{n-1}$ indicate that $X_n$ is still far away
from $\theta$. In view of this, certain approximation procedures are considered, for
which the magnitude of the nth step (i.e., $X_{n+1} - X_n$) depends on the number
of changes in sign in $(X_i - X_{i-1})$ for $i = 2, \cdots, n$. In Theorems 2 and 3,
$X_{n+1} - X_n$ is of the form $b_n Z_n$, where $Z_n$ is a random variable whose conditional
expectation, given $X_1, \cdots, X_n$, has the opposite sign of $X_n - \theta$ and $b_n$ is a positive
real number. $b_n$ depends in our processes on the changes in sign of
$X_i - X_{i-1}$ $(i \le n)$
in such a way that more changes in sign give a smaller $b_n$. Thus the smaller
the number of changes in sign before the nth step, the larger we make the
correction on $X_n$ at the nth step. These procedures may accelerate the convergence
of $X_n$ to $\theta$, when compared to the usual procedures ([3] and [5]). The result
that the considered procedures converge with probability one may be useful
for finding optimal procedures. Application to the Robbins-Monro procedure
(Theorem 2) seems more interesting than application to the Kiefer-Wolfowitz
procedure (Theorem 3).
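The idea in the summary can be sketched in a few lines of code. The sketch below follows the step-size rule of Section 2, (2) and (3), as we read it: $b_1 = a_1$, $b_2 = a_2$, and thereafter a new $a$ is taken only when the last two corrections have a non-positive product. Everything else is an assumption made for illustration: the function names, the choice $a_j = 1/j$, the linear regression function $M(x) = x - \theta$, the Gaussian noise, and the starting point.

```python
import random


def accelerated_rm(observe, x1, alpha, a, n_steps):
    """Iterate X_{n+1} = X_n - b_n * (observe(X_n) - alpha) with b_n = a(t(n)).

    t(1) = 1, t(2) = 2, and for n >= 3 the index t(n) grows by one only when
    (X_n - X_{n-1}) * (X_{n-1} - X_{n-2}) <= 0, i.e. on a change of sign.
    Returns the path [X_1, ..., X_{n_steps + 1}] and the final value of t.
    """
    xs = [x1]
    t = 0
    for n in range(1, n_steps + 1):
        if n <= 2:
            t = n
        elif (xs[-1] - xs[-2]) * (xs[-2] - xs[-3]) <= 0:
            t += 1
        xs.append(xs[-1] - a(t) * (observe(xs[-1]) - alpha))
    return xs, t


def demo(seed=0, n_steps=5000, theta=2.0):
    """Noisy illustration: observations are (x - theta) plus standard normal noise."""
    rng = random.Random(seed)

    def noisy(x):
        return (x - theta) + rng.gauss(0.0, 1.0)

    xs, t = accelerated_rm(noisy, x1=10.0, alpha=0.0,
                           a=lambda j: 1.0 / j, n_steps=n_steps)
    return xs[-1], t
```

In a noise-free run with `observe = lambda x: 0.5 * (x - 2.0)` started at 10, the iterates decrease monotonically toward 2, no sign change occurs, and the step size stays at $a_2$ throughout. This is the acceleration effect described in the summary: the step is reduced only once the iterates start to oscillate.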


2. Statement of the theorem. The formulation of the theorem is similar to
that of the theorem given by Dvoretzky [2]. Let $\theta$ be a real number and
$T_n$ $(n = 1, 2, \cdots)$
be measurable transformations. Let $X_1$ and $Y_n$ $(n = 1, \cdots)$ be random variables²
and $\{a_n\}$ a sequence of positive numbers and define

(1) $X_{n+1}(\omega) = T_n(X_1(\omega), \cdots, X_n(\omega)) + b_n(\omega) Y_n(\omega).$

The sequence $\{b_n(\omega)\}$ is selected in the following way from the sequence $\{a_n\}$:

Received March 28, 1957. 

¹ Note added in proof: The author learned recently that investigation of the above
procedure had been suggested by Professor H. Robbins long ago.

² $X_n$, $Y_n$, and $Z_n$ denote random variables, whereas $x_n$ is used to denote values taken by
the random variables.
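(2) $b_1 = a_1, \quad b_2 = a_2, \quad b_n = a_{t(n)}, \quad t(n) = 2 + \sum_{i=3}^{n} \iota[(X_i - X_{i-1})(X_{i-1} - X_{i-2})],$

(3) $\iota(x) = \begin{cases} 1 & \text{if } x \le 0, \\ 0 & \text{if } x > 0. \end{cases}$

Thus, every time $(X_i - X_{i-1})$ differs in sign from $(X_{i-1} - X_{i-2})$ we take another
$a_i$.

Let $\alpha_n(x_1, \cdots, x_n)$, $\beta_n(x_1, \cdots, x_n)$ and $\gamma_n(x_1, \cdots, x_n)$ be nonnegative
functions and put

(4) $\epsilon_N = \sup_{\{x_n\}} \sum_{n=N}^{\infty} \beta_n(x_1, \cdots, x_n),$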


(5) p(6) = inf inf 


m=1,2,-++ |z,—0| 28 


Z1,°°*,.2,—1 arbitrary 


THEOREM 1. /f 


((1 + Bn(a:,-°*-,2n))| tn — 0 


(6) | Tali, °°',%n) — O| S — 7,(%1,°** ,X%n) when (T, — 6)(xz, — 6) > 0 


lan(%1,°°*,2n) when (T, — 6)(z, — 0) SO, 


(7) lima,(%,°-°°,2%n) = 0 uniformly, for all sequences x1, %2,°*- 
t(n) > « with t(n) > ~, 


uniformly, for all sequences 
ry ’ Ze ; “*#-. 


, 


(9) $\lim_{N \to \infty} \epsilon_N = 0,$

(10) $\rho(\delta) > 0$ for every positive $\delta$,


x x 
(11) > a, = > a < a, and tue & G 
n=l a= 


³ In (5), (8), and in (13), $b_n$ depends on $x_1, \cdots, x_n$ as given in (2).





cuvil sa 61! » 
C2) EY, | Xp,.°**', Ke) = 0, 
E(Y; 
lim inf lim inf 
P e n>0 720 O</z,—O0\ <r 
(13)° Z}.°**.2,__, arbitrary 
-Pi Tos or coe, X,)+ b,  # Ss 
lim inf lim inf 
n>2 720 0< |z,—O0\ <7 
2),.°**. Z_—} arbitrary 
PUT Axe po > Boas + bn ie ~ ’ i.=2 kore a Zant > 0, 
then $X_n$ converges to $\theta$ with probability 1.

PROOF OF CONVERGENCE. Without loss of generality we take $\theta = 0$. Also
we assume in the following $E|X_1| < \infty$. This can be done, because replacing
$X_1$ by

$X_1' = \begin{cases} X_1 & \text{if } |X_1| < A, \\ A & \text{if } |X_1| \ge A \end{cases}$

changes the process only with a probability equal to
$P\{|X_1| > A\}.$

By taking A large enough, this probability becomes arbitrarily small. We frequently
do not write all the arguments of the functions, e.g., we write $\beta_n$ for
$\beta_n(x_1, \cdots, x_n)$. We shall first prove several lemmas. From

$E(Y_n \mid X_1, \cdots, X_n) = 0$
and $E(Y_n^2 \mid X_1, \cdots, X_n) \le \sigma^2$ follows immediately


LEMMA 1. There exists a function $p(\delta)$ with $0 < p(\delta) < 1$ for $\delta > 0$, and such
that

$P\{Y_n > \delta \mid X_1, \cdots, X_n\} \le 1 - p(\delta) < 1,$

$P\{Y_n < -\delta \mid X_1, \cdots, X_n\} \le 1 - p(\delta) < 1.$


LEMMA 2. 


lim inf Pd Xess — Ss |X, °° Xa; >6 and t(n) & kf 


n~-o2 \ 


< 1 — p(o'(d)), 


⁴ $P\{\cdot \mid \cdot\}$ and $E(\cdot \mid \cdot)$ denote conditional probabilities and conditional expectations
respectively.










om 
lim inf P{ Xv ee [Xie Xas Xe 


lA 


n-2 a 


—5 and t(n)= i 
< 1 — p(o'(6)), 
where 
( (6) when p(d)a, S 6 

Ase’ 

p (6) = 4 
le when p(d)a; > 6. 
Ok 


Proor. Since f(n) = k, we have 6b, S a, and for X, = 6 


Xavi <= max (0, (1 + Bn) Xn ani ¥nl + bn Ye S Xe + bn E: = | 





bn 
0.  - = Zz. ed b, kK a, — p'(6) + 2 r. |. 
So by (8) and Lemma 1, we have 
( 1 5 
a Dies a (5)bn| <> . hoe i \ 
lim inf P< Xn4i — Xn, = eS > Maka, © ** = ge: ek t(n) = | 
n-»2 \ “< | 
ane ad p (0) St : = a 
< lim inf P(Y, = a — €|X1,°°°, Xn; an et, t(n) = k) 
n+o \ - / 


for every « > 0. 


Application of Lemma 1 gives the first inequality. Similarly we prove the second 
part of the lemma. 
LEMMA 3. For every k and N

$P\{t(n) = k \text{ for } n \ge N \text{ and } X_n \not\to \theta\} = 0$

(i.e., when $X_{n+1} - X_n$ changes sign only a finite number of times, then $X_n$
converges to $\theta$).

Proof. When $t(n)$ is constant for $n \ge N$, then $X_n$ is monotonic for $n \ge N$.
Therefore $\{X_n\}$ converges (possibly to $+\infty$ or $-\infty$). Let the limit be positive,
say X. But by Lemma 2 for every $\delta > 0$ and $\epsilon < [\rho'(\delta) a_k]/2$,

$\lim_{N \to \infty} P\{X_{n+1} - X_n > -\epsilon \text{ and } X_n \ge \delta \text{ and } t(n) = k \text{ for all } n \ge N \mid \delta < X_N \text{ and } t(N) = k\} = 0,$

so the probability that $X > \delta$ is zero. Similarly the probability that $X < -\delta$ is zero.
Since $\delta$ is an arbitrary positive number, this proves the lemma.

This lemma allows us to limit ourselves in the sequel to those sequences with
$t(n) \to \infty$ and therefore $b_n \to 0$.

LEMMA 4. Let $\delta$ be a fixed positive number. Then there exist positive numbers
$n_0$ and $t_0$ such that, whenever $n \ge n_0$, $t(n) \ge t_0$ and $|X_n| \ge \delta$, one has

$E\{|X_{n+1}| \mid X_1, \cdots, X_n;\ |X_n| \ge \delta\} \le |X_n| - \dfrac{b_n}{4} \rho(\delta).$


Proor. Choose f such that a,(X1,--- , X,) < 6/2 for t(n) = t& and 


: 26 bp(S) 6 
(14) a, = min (jo 602’ +) . 


Then b, S a,, for ((n) = to. We distinguish two cases 


| T,.(X1, --+ , Xn) | 


be ‘i 6 
X.| & &1f. | = 5 


bE|¥,| Satbo< 


(b) | Tet Ets pei X,) } > a 


As an(X1,-°-- , Xn) S 6/2 for t(n) = t , we must have 7,-X, > 0 (ef. (6)). 
Let X, 2 6. Denote the distribution function of Y,(X,, --- , X,) by Ha(y | X). 
As Xau: = T, + b,Y,, we have by (12) and (14) 


£4 | Xeu||Xi,---, Xap Xez ar 

\ =F 
« —Tralba 

= [ (7. + b,, y) dH,,(y | X) — / (7. + b, y) dH,,(y | X) 
Talba Lee 


eh lf 
— 71d 


n nn 


ais, 
y dH,(y | X) — [ y dH,,(y | x) | 
—Trlbna 
_- ab, | y dH,(y | X) 


—Trlba —Ta/lb, 1/2 
, + 2b, | [ van. |x | ay | X) | 


bag 4b? o p(4) 
2b, “ = 7 + 6. ; 
+ 2b,0 T. T. + ; = T, +b 4 


But by (8), 


( a \ 
1T.| $1 Xa] + by (Pat Xe — o(s)$s |X.) — 1,2 


bn J 2 


for sufficiently large n, say $n \ge n_0$. For $X_n \le -\delta$, the proof is similar. Thus,
in all cases

$E\{|X_{n+1}| \mid X_1, \cdots, X_n;\ |X_n| \ge \delta\} \le |X_n| - \dfrac{b_n \rho(\delta)}{4}.$
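LEMMA 5. For every $0 < \delta < \delta' < \delta''$

$P\{\delta < \liminf |X_n| < \delta' \text{ and } \delta'' < \limsup |X_n| \text{ and } t(n) \to \infty\} = 0.$

Proof. Choose $t_0$ and $n_0$, corresponding to $\delta$ as introduced in the preceding
lemma. Assume now

$P\{\delta < \liminf |X_n| < \delta' \text{ and } \delta'' < \limsup |X_n| \text{ and } t(n) \to \infty\} > 0.$

Then there exist an $n_1 \ge n_0$ and $t_1 \ge t_0$ such that

(15) $P\{\delta < \liminf |X_n| < \delta' \text{ and } \delta'' < \limsup |X_n| \text{ and } |X_n| > \delta \text{ for all } n \ge n_1 \text{ and } b_{n_1} = a_{t_1}\} > 0.$

Now introduce a new process.

$Z_i = |X_i|$ if $i = 1, \cdots, n_1$

and

$Z_{n_1+i} = \begin{cases} |X_{n_1+i}| & \text{if } \delta < Z_{n_1+j} \text{ for } j = 0, 1, \cdots, i - 1 \text{ and } b_{n_1} = a_{t_1}, \\ 0 & \text{otherwise.} \end{cases}$

Unless $b_{n_1} \neq a_{t_1}$ or $|X_i| \le \delta$ for some $i \ge n_1$, we have always $Z_i = |X_i|$
and thus, by (15), also

(16) $P\{\delta < \liminf Z_n < \delta' \text{ and } \delta'' < \limsup Z_n\} > 0.$

But by Lemma 4

$0 \le E(Z_{n+1} \mid X_1, \cdots, X_n) \le Z_n$ for $n \ge n_1$.

So application of the semimartingale convergence theorem (Loève [4], p. 393)
shows that (16) cannot be true. This proves the lemma.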

LEMMA 6. $P\{\liminf |X_n| = \infty \text{ and } t(n) \to \infty\} = 0.$

Proof. If the proposition were not true, we could find, analogous to the last
proof, a process $Z_i$ with

(17) $0 \le E(Z_{n+1} \mid X_1, \cdots, X_n) \le Z_n$ for sufficiently large n,

and

(18) $P\{\liminf Z_n = \infty\} > 0.$

But as we took $E|X_1| < \infty$, one would have

(19) $E|Z_{n_1}| < \infty.$

However, (17) and (19) together are in contradiction with (18). This proves
the lemma.

From Lemmas 3, 5, and 6 one may conclude that with probability 1 either
$\liminf |X_n| = 0$ or $|X_n|$ converges to a finite positive number. We now
prove that the last possibility has probability zero.

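LEMMA 7.

$P\{|X_n| \text{ converges to } X \text{ and } 0 < \delta < X < \delta' < \infty \text{ and } t(n) \to \infty\} = 0.$

Proof. Choose $n_0$ and $t_0$ corresponding to $\delta$, as introduced in Lemma 4.
Assume now

(20) $P\{|X_n| \text{ converges to } X \text{ and } 0 < \delta < X < \delta' < \infty \text{ and } t(n) \to \infty\} > 0.$

Again there exist an $n_1 \ge n_0$ and a $t_1 \ge t_0$ such that

(21) $P\{\delta < |X_n| < \delta' \text{ for all } n \ge n_1 \text{ and } b_{n_1} = a_{t_1}\} > 0$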


and a;,p(6) S 


By Lemma 2 we can choose n; and /, so that at the same time for n = ny, 


P<Xau—- Xn 2 - 
(22) : 
p(p(4)) 
> ’ 
P< Xan —X,8 
(23) | 
plo() 


& 


As before we construct a new process. 
Z,=|X,| if 
and 
| Xnigs | if 5 < Zayas < 8 forj = 0,---,7— landb,, = a,,, 


6 ; 
Zn, +i-1 _ Qt,4i-1 ? otherwise. 


{ 
Zni +i = 


From (21) follows 


(24) P{é<Z,<& for all n=m}>0, 


and thus, 


n ) 


(25) pi | Do (Liar — Ze) | < 2(8 — 8) for alln = n$> 0. 
\ | ken, | 


4 
Denote
$E(Z_{k+1} - Z_k \mid X_1, \cdots, X_k)$ by $m_k(X_1, \cdots, X_k)$ ($= m_k$ for short).

By Lemma 4 and the construction of the Z-process,

(26) $m_k(X_1, \cdots, X_k) \le -\dfrac{c_k}{4} \rho(\delta)$ for $k \ge n_1$,

where

$c_{n_1+i} = \begin{cases} b_{n_1+i} & \text{if } \delta < Z_{n_1+j} < \delta' \text{ for } j = 0, \cdots, i \text{ and } b_{n_1} = a_{t_1}, \\ a_{t_1+i} & \text{otherwise.} \end{cases}$


Further fork = n,, 


(27) var (Zya1 — Z | Xi, --+, Xe) S Ck [7 + Ey:| < cic, 
where 
p (8) 
iP ae 
= 
In addition 
(28) > Ge 2 Zz Gt+k- 


k=ny k=) 


By (25), 


P 3 (Zest — Z, — m) 
C4 


(29) : 
> m | — 2(8 — 3 for alln = m? > 0, 


k=n} / 


and thus, by (26) and (28), 


Pt Zz. (Zu41 — Z, — m) 


| kn | 


(30) \ n—ny 
> DX dys — 206" — 8) for alln 2 m)> 0. 
k=0 ; 
But for 


c\ "2-71 
(6) DY ay4z — 2(8 — 8) > 0 
4 =o 


we have by Tchebycheff’s inequality and (27) 
o 4 -


| n—n} 7 
P< 7 (Zi4s am Ze = my) | (3) Z.. On+k — 2(8’ —_ s)} 


\ | ken 4 kno J 


Cc > Fé 


< hk=n} 


= <<... - 
eX) ze Gtn+k — 2(8’ — 8) > 
| 4 imo J 


n—n) 


n 
* 9 . 
> Ec, S > 01,44 ETi+e, 
k=0 


k=ny 


$\tau_{t_1+k}$ = number of times $c_{n_1+i} = a_{t_1+k}$.

As soon as the $Z_i$ process differs from the $|X_i|$ process, we don't keep the same
$a_{t_1+k}$ for more than one step. Therefore $E\tau_{t_1+k} \le 1$ + expected number of times
that $\{c_{n_1+i} = a_{t_1+k}$ and $\delta < Z_{n_1+j} < \delta'$ for $j = 1, \cdots, i$ and $b_{n_1} = a_{t_1}\}$ occurs.
If 5 < Xn,4: < 8’, then by (22), 

P{Xnsinr > Xnj,as | 8 < Xn, 43 | < 0 j= ee | 

p(p(5)) 


9 , 


and b,, = a, and Xn,4; > 6} $1 — 


it ial ee 


,8 
5 ( \\e 
and b,, = a;,, and X,,4; > 6> $41 — mee) ‘ 
As we pick a new a; as soon as (X; — X,_1)(Xi. — Xi_2) S 0, we have: Ex- 
pected number of times that 


{Cn,4+¢ = Gy4e and 6 < Zain; < 8 j=1,---,t@ and b,, = a,,} 


n 


under the condition that Y,;,, 2 X; > 6 for the first 7 n, with bj = aya, 
is at most 


ai ad 2) ( ote) 3 
33 24+(1 — Bw , — POEO)V <3 
: ( , se 2 = D0) 


The case where X;,, < X; at the first time that b; = a;:,4. is more difficult. 


Let us divide the interval (6, 6’) in 


[2 _ I = 
p(d)at,4% 
non-overlapping intervals’ J, with 


5 fa} is the largest integer S a. 
(5) ; 2(3’ —
length (J,) < rans (: =1,2,--. a c ») 4 1). 
= p(6) dt, 4% 
Expected number of times that 
{Ca4i = Qy4e and 68 < Zy4;< 8 fj = 1,---, 8, Zac € Le} 


under the condition that Xj., < X;, X; > 6 for first 7 2 m, with bj = an4:, 


is at most 
i) ( Bie)’ 2 
1 }.~ 2 l=- a 
* ( 2 * 2 P(p(5)) 


This can be proved analogous to (33) using (22) and the fact that 


length (1,) < [p(6)az,+x]/2. 


| 2 ne | i % 
p(5)at, 4% 


intervals J; , expected number of times that 


As there are 


{Cnj4s = G42 and 6 < Za4;5< 8 7 =1,---,t, and ba, = ay} 
under the condition that X;,, < X;, X; > 6 for the first 7 = mn, with 


b; _ Qty +k ’ 


(F2(8’ — 6) ) 
2{{ | . . 


is at most 


p(p(d)) 
Similar estimates are valid when X; < —6 for the first 7 = m with b; = ai,4:. 
AS 41,4% — O(k — ~), we can find a positive constant D such that 
D 


Ervin S . 
At,+k 


By (31) and (32), it follows that 
|< | 6) 
P{ |X Cay — Ze m) 2eL 


(34) 


Dt, +k 


— | n—n 7? 
é) . ' 
ve dX 144% — 2(8 — a} 


As Domo Gt,4k = ©, the right hand side of (34) tends to zero when n — ~ 
and therefore (29) cannot be true and

$P\{|X_n| \text{ converges to } X \neq 0 \text{ and } t(n) \to \infty\} = 0.$

Combining the remark after Lemma 6, and Lemma 7 we proved

(35) $P\{\liminf |X_n| = 0\} = 1.$

Until now we only used that $a_n$ tends monotonically to zero and
$\sum_{n=1}^{\infty} a_n = \infty$,
but not yet $\sum_{n=1}^{\infty} a_n^2 < \infty$.
Lemma 8. Define 


f 


[ ..8 Rh,--- S38 
\-1 eS it-«- 255. 64 


Yi, = Y, II] sf), 


j=l 


s(n) 


d(m,m—1)=1 


d(m, n) 


Il (1 + 8;) (n = m), 


S(m + 1, n) 


ll 


> dj + 1, nb; Y}. 
j=m 


Then the conditions 


Omsj-1(X1, °°* » Kenpj—r) S = j@= A, * yk, 
d(m, ©) <3, 
x3 «%. 
| a 
\Xusl>5 j=l k—1 
4 
and 
is tobi «2. 
sup | (m+ 1,n)| < 6 
imply 
Xess | S = j=il,---,k. 


The proof follows immediately from Wolfowitz [6], p. 1154. We need the fol- 
lowing 
COROLLARY. If $t(m)$ is so large that


> € 
Omyj-i(X1,°°* , Xmyji) S = 


then 


Ps | Xuory | > 5 ~ for some positive integer j| | s¢} 


P{ sup |S(m +1,m)| = sah 
) 


ny.neom 6) 


92 oo 
< 3? > var {d(j + 1,n)b;¥!} < (4 p> > Eth. 
I@~=m 


€ 
PROOF OF THEOREM 1. In view of Lemma 3, we only have to prove
$P\{\limsup |X_n| > 0 \text{ and } t(n) \to \infty\} = 0.$
By condition (13)
e = min (lim inflim inf P{X.4,—X,20|Xi,-++, Xn-1,Xn = Zn}, 


n+ 120 O<|z,\/ 57 


lim inflim inf P}Xay— Xn, < 0| Xi,-+-,Xni1,Xa = tn} ) > 0 


nx 720 0<lz,/S7 


Take — > 0 and m such that 
P{Xasi ore Be e 0| X,, ea ; aon 1, Xn 
tsa =~ Ra < Ol As, -°* 5 Rae, Bx = 


forO0 < |2,| S gandn= m. 


Choose an e S ~£ and & such that 


(36) 


an(X1,°°+,Xn) S= when t(n) = 


and let 


Let now for some m = nz 
Construct the following process
Zs” =X, if k=1,---,m 
and 
. _ ‘i pi-1(Zy “heii Zm+i-1) 
+ Cm4i—1¥ moi-1(X1, °°* » Xmge-s1)(¢ = 1, 2, --- ) 
where the c’s are determined in the following way: 


Cm = Dn = Ct(m) 


mit if |Zms|S5 jf =0,1,---,i 


and (Bets = Zari- (Zain os Zm+i-2) > 0 


= : 2 € ; z 
(37) Cues = ay if | Zm+i | = 5 a= 0, 1, tee 


and (Zins _ Zari-(Znyi-1 — Zunri~-2) Ss 0 


and Cmij-1 = G11 
im)+; Otherwise. 
Then Er; = expected number of times c,,.; = al is zero when | < t(m). For 


l = t(m) it is at most 


38) l+(l1-H+(1-g)--- = 


“| 


In fact from (36) and (37), 
P{emsj = Cm+s1} S 1 — Ff. 


Using (38) and applying the corollary of Lemma 8 to the Z(m) process, and 
thus replacing the b’s by the c’s, one finds for m = ne 


-? 


( ' 
P<|Xu+s| > > for some positive integer j | inal & 7° t(m) = } 
| -_ | J 


AN 


Pd| £13|> = for some positive integer j|| ZS" | < qo t(m) = ab 


48c\? < a’, 
(#) 5 


Now choose ¢, = t. such that 


IA 
and n; => mn» such that
\ 


P| ma > forall n=n; or t(n3) < | and i(n)—-> > “> 


=5 
(such an n; exists by (35)). Then 
Pilim sup| X,| >e and t(n)— ~} 


( ) 
= P< [ << > forall n =n; or t(n:) < | and t(n)—- 1 


0 sc 


>< X, isthe first after X,,.. with |X,,.| S$ 


af 
~ 4 


mM=n3 


we € 
and t(n;) = t; and max | Z,(m)| > zl 
k>m “ 


f 


> PX, isthe first after X,,-1 with |X,|< 
mM=n3 \ 


As the only restriction on ¢€ is « S &, this proves the theorem. 


3. Applications. 
Accelerated Robbins-Monro procedure. 


THEOREM 2. Let $X_1$ and $Y(x)$ be random variables and $\{a_n\}$ a sequence of positive
numbers and define

$X_{n+1}(\omega) = X_n(\omega) - b_n(M(X_n) - \alpha) + b_n Y(X_n).$

The sequence $\{b_n\}$ is selected in the following way from the sequence $\{a_n\}$:

$b_1 = a_1, \quad b_2 = a_2, \quad b_n = a_{t(n)}$
(cf. (2) and (3)).
If $M(x)$ is a measurable function satisfying

(39) $(x - \theta)(M(x) - \alpha) > 0$ for $x \neq \theta$,

(40) $\inf_{\delta < |x - \theta| < \infty} |M(x) - \alpha| > 0$ for every $\delta > 0$,

(41) $|M(x) - \alpha| \le c + d|x - \theta|$ for some positive constants c and d,

and if

(42) $\sum_{n=1}^{\infty} a_n = \infty, \quad \sum_{n=1}^{\infty} a_n^2 < \infty, \quad \text{and } a_{n+1} \le a_n,$

(43) $E(Y(X_n) \mid X_1, \cdots, X_n) = 0, \quad E(Y^2(X_n) \mid X_1, \cdots, X_n) \le \sigma^2$ with probability 1,

(44) $\lim_{\tau \downarrow 0} \inf_{0 < |x - \theta| \le \tau} P\{Y(x) - M(x) + \alpha \ge 0\} > 0, \quad \lim_{\tau \downarrow 0} \inf_{0 < |x - \theta| \le \tau} P\{Y(x) - M(x) + \alpha < 0\} > 0,$

then

$P\{X_n \text{ converges to } \theta\} = 1.$
Proor. Take 
b,(c + d| 2, — 6)) for b,d > 1 
oe he for bd <1, 
8, = 0, 


Ya ™ b,, | M(z,) —_ a | 


in Theorem 1. 


The process as described in Theorem 2 gives a stochastic approximation method for the point $\theta$ which uses the number of changes in sign in

$$(X_i - X_{i-1})(X_{i-1} - X_{i-2}), \qquad i = 3, \dots, n,$$

to determine $(X_{n+1} - X_n)$. We only reduce $b_n$, and thus the magnitude of $X_{n+1} - X_n$, when the last two corrections $X_n - X_{n-1}$ and $X_{n-1} - X_{n-2}$ had different signs. As indicated in the summary this process may pull $X_n$ to $\theta$ faster (for large $|X_n - \theta|$) than the Robbins-Monro procedure. In Theorem 2 the conditions are slightly stronger than for the Robbins-Monro process as given by Blum [1]. Blum does not need $a_{n+1} \le a_n$ or (44) and has

(40a) $\inf_{\delta \le |x - \theta| \le \delta'} |M(x) - \alpha| > 0$ for every $0 < \delta \le \delta' < \infty$

instead of (40).
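
The recursion of Theorem 2 can be sketched in a few lines. This is only an illustration under stated assumptions: equations (2) and (3) are not reproduced in this part of the text, so the rule used here (take $b_1 = a_1$, $b_2 = a_2$, and move to the next $a$ only when the product of the last two corrections is not positive) is an assumed reading of them. The names `accelerated_rm`, `observe` and `a` are hypothetical; `observe(x)` stands for a noisy measurement $M(x) - Y(x)$.

```python
import random


def accelerated_rm(observe, alpha, x1, a, n_steps):
    """Sketch of the accelerated Robbins-Monro recursion.

    observe(x): noisy measurement M(x) - Y(x) (hypothetical callback).
    a(k):       k-th term a_k of the positive step sequence, k >= 1.
    The step b_n = a_{t(n)} shrinks only after a sign change between
    the last two corrections (assumed reading of (2) and (3)).
    """
    xs = [x1]
    t = 0
    for n in range(1, n_steps + 1):
        if n <= 2:
            t = n  # assumption: b_1 = a_1 and b_2 = a_2
        elif (xs[-1] - xs[-2]) * (xs[-2] - xs[-3]) <= 0:
            t += 1  # the last two corrections differ in sign
        b_n = a(t)
        xs.append(xs[-1] - b_n * (observe(xs[-1]) - alpha))
    return xs


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative regression function M(x) = 2(x - 1), root at theta = 1.
    path = accelerated_rm(
        observe=lambda x: 2.0 * (x - 1.0) - rng.gauss(0.0, 0.1),
        alpha=0.0, x1=5.0, a=lambda k: 1.0 / k, n_steps=2000,
    )
    print(path[-1])
```

While the iterates keep moving in the same direction the step stays at its current size, which is the sense in which the procedure may approach $\theta$ faster from far away.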


One can easily give an example to show that we cannot replace (40) by (40a), and the following example shows that (44) cannot be dispensed with.
EXAMPLE. Take
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Let {2n+4:,0}(m = 0, 1, --- ) be a sequence of real numbers such that 
Ton41,0 ~ Lames for n#Am and 1 S Zenyio S 2. 

Let {2eno}(m = 1, 2, +--+) be a sequence of real numbers such that 
ono % Tomo for nX#m and —2 S tao S —1. 


We now construct recursively sequences {z,4}/(k = 0, 1,--- ). Put 


Z(x) = M(x) — Y(a) 
Lat = Z(2n), 


Xnti = Xe = baZ (Xn) 


We start with {2,,} by taking 


with probability 3 


with probability 3 


, 


” . ” . 
where 32; 9 < 210 < %,0. Further take 2); = 210 — 210, and in general 


P 21.4 = Tin — 229 with probability 3 
41k = ” . _— 

l2a., with probability 3 
and 


where 


For n > 1 we take 


(n — 1)(2n.0 — 2n41.0) with probability cz 
2n? 


with probability 1 — 72? 


Trl 


A ° 
where 2,0 is such that 


” '‘ ” ' i 
Zn.0°In0 > O and 3)| 220) < | Zao] < | tno 


and x, is not equal to any 2»,; with m < n. Further for k > 0, 
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= N(In~ — In41.0) with probability a3 


| ne with probability 1 — 53 . 


li» 


Injk+] = Ine ~ — Zn,ky 
n 


” ° 
where z, , is such that 


ZntInk > 0, 4] tan] <| zn] < | tae 
and 2,441 is not equal to any 7m,; with m < n. We take M(z,) = EZ, and 
Y(tnn) = Zax — M(rax). 


For « # 2, for all n, k, we take M(z) and Y(z) in any way such that the con- 
ditions of Theorem 2, except (44), are satisfied. 

Take now X; = 2:9 with probability 1. By the choice of z,,, we get the 
value Z,41,0 aS soon as Z,, takes the value z,,. But for every n, with prob- 
ability 1, Z will take once the value z,,. Therefore with probability 1, all the 
values Z,.o occur in the sequence {X,} and thus, 


$$P\{X_n \text{ converges to } \theta\} = 0.$$

Accelerated Kiefer-Wolfowitz procedure.
THEOREM 3. Let $X_1$ and $Y(x)$ be random variables and let $\{a_n\}$ be a sequence of positive numbers and $u$ some positive constant and define

$$X_{n+1}(\omega) = X_n(\omega) - b_n[M(X_n - u) - M(X_n + u)] + b_n[Y(X_n - u) - Y(X_n + u)].$$

The sequence $\{b_n\}$ is selected in the following way from the sequence $\{a_n\}$:

$$b_1 = a_1, \qquad b_n = a_{t(n)}$$

(cf. (2) and (3)). If $M(x)$ is a measurable function, satisfying


inf {M(x — u) — M(x + u)} > 0 
é 


z—-@> 


inf {M(x — u) — M(x + u)} < 0 


z—0<-35 


(45) 


for every 6 > 0, 


(46) $|M(x - u) - M(x + u)| \le c + d\,|x - \theta|$
for some positive constants $c$ and $d$, and if


(47) $\sum_{n=1}^{\infty} a_n = \infty$, $\sum_{n=1}^{\infty} a_n^2 < \infty$, and $a_{n+1} \le a_n$,


ard. «+ 0) ~ 7. + OR, *, XD = 
BAYS. « «) = 1X. + WF Ri, >, XK 


(48) 


with probability 1, 


lm inf P{Y¥(t@—u) — Y(ir7+u) —-M(ie-—aw + 


720 O<lz-O0l| sr 


> 0} 


lim inf P{¥(@ —u) — Y(r7+ u) — M(@z — u) + M(x + wu) 


tT20 O0< '2r-86 


< 0} 


then

$$P\{X_n \text{ converges to } \theta\} = 1.$$


PROOF. Take 
for b,d 


for b,d 


vn = b,| M(x, — u) — M(an + u) 


in Theorem 1. 

REMARK. Theorem 3 is also implied by Theorem 2. The procedure in Theorem 3 requires $u$ to be independent of $n$, and therefore differs from the usual Kiefer-Wolfowitz procedure ([3]). Also condition (45) does not imply that $M(x)$ has a maximum, or if it has one, that $\theta$ is the location of the maximum. However, for every $y$ with $|y - \theta| > u$, there exists an $x$ with $|x - \theta| \le u$, such that

$$M(x) > M(y).$$
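
The fixed-$u$ recursion of Theorem 3 can be sketched in the same way. As before this is a hedged illustration: the step rule (keep the current $a$ until the last two corrections differ in sign, with $b_1 = a_1$ and $b_2 = a_2$) is an assumed reading of (2) and (3), and the names `accelerated_kw` and `observe` are hypothetical, with `observe(x)` standing for $M(x) - Y(x)$.

```python
def accelerated_kw(observe, x1, u, a, n_steps):
    """Sketch of the accelerated Kiefer-Wolfowitz recursion with fixed u.

    observe(x): noisy measurement M(x) - Y(x) (hypothetical callback).
    a(k):       k-th term a_k of the positive step sequence, k >= 1.
    """
    xs = [x1]
    t = 0
    for n in range(1, n_steps + 1):
        if n <= 2:
            t = n  # assumption: b_1 = a_1 and b_2 = a_2
        elif (xs[-1] - xs[-2]) * (xs[-2] - xs[-3]) <= 0:
            t += 1  # shrink the step only after a sign change
        b_n = a(t)
        x = xs[-1]
        # Symmetric difference of two observations at x - u and x + u.
        xs.append(x - b_n * (observe(x - u) - observe(x + u)))
    return xs


if __name__ == "__main__":
    # Noise-free illustration: M(x) = -(x - 2)^2 has its maximum at 2.
    path = accelerated_kw(lambda x: -(x - 2.0) ** 2, 10.0, 0.5, lambda k: 1.0 / k, 20)
    print(path[:4])
```

For this quadratic $M$ the difference $M(x-u) - M(x+u)$ equals $4u(x - 2)$, so the sketch behaves like the Theorem 2 recursion applied to a linear function, in line with the remark that Theorem 3 is implied by Theorem 2.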


Acknowledgement. The author wishes to thank Professor J. Wolfowitz for 
suggesting the possibility of generalizing Theorem 2 and for his many other 
helpful and critical remarks.

TESTING THE HYPOTHESIS THAT TWO POPULATIONS
DIFFER ONLY IN LOCATION¹


By BALKRISHNA V. SUKHATME 


Indian Council of Agricultural Research, New Delhi 


0. Summary. Let $X_1, X_2, \dots, X_n$ be $n$ independent identically distributed random variables with cumulative distribution function $F(x - \xi)$. Let $\hat\xi(X_1, X_2, \dots, X_n)$ be an estimate of $\xi$ such that $\sqrt{n}(\hat\xi - \xi)$ is bounded in probability. The first part of this paper (Secs. 2 through 4) is concerned with the asymptotic behavior of U-statistics modified by centering the observations at $\hat\xi$. A set of necessary and sufficient conditions is given under which the modified U-statistics have
the same asymptotic normal distribution as the original U-statistics. These 
results are extended to generalized U-statistics and to functions of several 
generalized U-statistics. The second part gives an application of the asymptotic 
theory developed earlier to the problem of testing the hypothesis that two 
populations differ only in location. 


1. Introduction. Let $X_1, X_2, \dots, X_m$ and $Y_1, Y_2, \dots, Y_n$ be two independent samples of observations from populations with cumulative distribution functions $F(x - \xi)$ and $G(x - \eta) = F[(x - \eta)/\delta]$ respectively, $\xi$ and $\eta$ being the unknown location parameters and $\delta$ a scale parameter. No knowledge is assumed concerning the distribution functions $F$ and $G$ except that they are absolutely continuous. The problem considered in this paper is that of testing the hypothesis that the two populations differ only in location against the alternative that the X's are more spread out than the Y's or vice versa, or in symbols

(1.1) $H: \delta = 1$, $\quad A: \delta \ne 1$.

From intuitive considerations and the work of Fraser [1], it seems likely that 
there do not exist similar tests for testing the hypothesis H, which are very 
satisfactory. The following simplified problem was therefore considered by the 
author [2]. Let the location parameters $\xi$ and $\eta$ be known, say $\xi = \eta = 0$, so that the distribution functions of $X$ and $Y$ differ only in the scale parameter. Then the problem considered is that of testing the hypothesis

$H': \delta = 1$, i.e., $F = G$,
$A: \delta \ne 1$, i.e., $F \ne G$.


Received May 18, 1956. 

¹ This research was performed while the author was at the Statistical Laboratory, University of California, Berkeley, and was supported by the Office of Ordnance Research, U.S. Army, under contract DA-04-200-ORD-171.




Several nonparametric tests have been suggested for testing the hypothesis 
H’, particularly by Mood [3]. The author [2] has considered some of these tests 
and discussed their asymptotic properties from the point of view of power con- 
siderations. These tests are based on what are known as generalised U-statistics 
and are reasonably efficient. But our main interest lies not in testing the hypothesis $H'$ but $H$. However, once we have a class $\{W_N\}$ of tests for testing the hypothesis $H'$, a class of tests $\{\hat W_N\}$ for testing the hypothesis $H$ suggests itself. This class of tests may be obtained as follows. We obtain suitable estimates of the parameters $\xi$ and $\eta$ and then apply any of the tests of the class $\{W_N\}$ to the deviations of the X's and the Y's from the respective estimates.

If the X's and the Y's come from normal populations, the usual test of significance for testing the hypothesis $H$ is the variance ratio test based on the statistic

$$F = \frac{\sum_{j=1}^{n} (Y_j - \bar Y)^2 / (n - 1)}{\sum_{i=1}^{m} (X_i - \bar X)^2 / (m - 1)},$$

which is also the most commonly used statistical test for comparing sample 
variances. Usually, however, since little is known about the populations from 
which the samples are drawn, this test is used as if the assumption of normality 
could be ignored. It appears, however, that this is not the case. 
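
As a small illustration of the variance ratio statistic discussed above, the following sketch computes the ratio of the two unbiased sample variances. The function name `variance_ratio` is hypothetical and the data are made up for the example; the sketch only shows how the statistic is formed and says nothing about its null distribution.

```python
from statistics import variance


def variance_ratio(xs, ys):
    """Ratio of unbiased sample variances, Y-sample over X-sample."""
    return variance(ys) / variance(xs)


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]  # same shape as xs, doubled in scale
    print(variance_ratio(xs, ys))
    # Shifting either sample leaves the statistic unchanged, since each
    # sample is centered at its own mean.
    print(variance_ratio([x + 10.0 for x in xs], ys))
```

The statistic depends on the data only through deviations from the sample means, which is the same device (centering at location estimates) that the modified tests of this paper apply to nonparametric statistics.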

The sensitivity to non-normality of the F-test was first pointed out by E. S. 
Pearson [4] whose findings were later confirmed by Geary [5] and Gayen [6]. 
They showed that the F-test is particularly sensitive to changes in Kurtosis 
from the normal theory value of zero. It is easy to see that the F-statistic when 
suitably normalised is asymptotically distribution free. More recently, Box
and Andersen ([7] and [8]) have studied this problem in great detail and have
shown on the basis of extensive sampling experiments that the F-statistic so 
normalized is insensitive to departures from normality. 

Since the tests considered in [2] are nonparametric and reasonable for normal 
alternatives, it appears that they might be more efficient for non-normal al- 
ternatives and also more stable for small samples. We propose, therefore, to 
investigate whether such tests, after modification by the introduction of esti- 
mates of parameters are asymptotically distribution free. 

This is achieved by considering the asymptotic theory of generalised U- 
statistics modified by the introduction of estimates of parameters, which is 
given in Secs. 3 and 4. In Sec. 5, it is shown that the nonparametric test pro- 
posed in [2], after modification, is asymptotically distribution free for popula- 
tions with bounded and symmetric probability densities. It turns out however 
that even under such restrictive conditions, the nonparametric test proposed 
by Mood, after modification is not asymptotically distribution free. Finally, 
the last section considers the small sample behavior of the proposed test for 
some particular alternatives.

2. Some definitions and known results.

DEFINITION 2.1. Let $X_{ij}$, $j = 1, 2, \dots, n_i$, for a fixed $i$ be independent random variables identically distributed with c.d.f. $F_i(x)$ and density function $f_i(x)$. Let $i$ run from 1 to $k$ and $s_1 \le n_1, s_2 \le n_2, \dots, s_k \le n_k$. Further, let

$$\varphi(u_1, \dots, u_{s_1};\ v_1, \dots, v_{s_2};\ \dots;\ w_1, \dots, w_{s_k})$$

be a function symmetric in each set of its arguments. Then the statistic

$$U_N = \binom{n_1}{s_1}^{-1} \binom{n_2}{s_2}^{-1} \cdots \binom{n_k}{s_k}^{-1} \sum \varphi(X_{1\alpha_1}, \dots, X_{1\alpha_{s_1}};\ X_{2\beta_1}, \dots, X_{2\beta_{s_2}};\ \dots;\ X_{k\delta_1}, \dots, X_{k\delta_{s_k}}),$$

where the summation runs over all subscripts $\alpha, \beta, \dots, \delta$ such that

$$1 \le \alpha_1 < \alpha_2 < \cdots < \alpha_{s_1} \le n_1,$$
$$1 \le \beta_1 < \beta_2 < \cdots < \beta_{s_2} \le n_2,$$
$$\cdots$$
$$1 \le \delta_1 < \delta_2 < \cdots < \delta_{s_k} \le n_k,$$

is called a generalised U-statistic.

Let $p_1, p_2, \dots, p_k$ be $k$ fixed numbers such that $n_i = N p_i$ and $\sum_{i=1}^{k} p_i = 1$. Then Lehmann [9] has shown that $\sqrt{N}[U_N - EU_N]$ is asymptotically normally distributed with mean zero and asymptotic variance $\sigma^2$ given by

(2.1) $\sigma^2 = \dfrac{s_1^2}{p_1}\,\zeta_{10\cdots0} + \dfrac{s_2^2}{p_2}\,\zeta_{010\cdots0} + \cdots + \dfrac{s_k^2}{p_k}\,\zeta_{00\cdots01},$

where

$$\zeta_{00\cdots1\cdots0} = E\varphi_1\varphi_2 - [E\varphi_1]^2,$$

1 occurs at the $i$th place in $\zeta_{0\cdots1\cdots0}$,

$$\varphi_1 = \varphi(X_{11}, \dots, X_{1s_1};\ \dots;\ X_{i1}, X_{i2}, \dots, X_{is_i};\ \dots;\ X_{k1}, \dots, X_{ks_k})$$

and $\varphi_2$ is obtained from $\varphi_1$ by replacing all the $X_{ij}$ by $X'_{ij}$ excepting $X_{i1}$, the primes denoting a new set of independent random variables. This result is a generalisation of the U-statistics considered by Hoeffding [10].

For the sake of simplicity, we shall restrict ourselves to the two sample problem only. The extension to $k$ samples is straightforward.

DEFINITION 2.2. As before, let $X_1, \dots, X_m$ and $Y_1, \dots, Y_n$ be two independent samples drawn from populations with c.d.f.'s $F(x - \xi)$ and $G(x - \eta)$ respectively. Further, let $\hat\xi(X_1, \dots, X_m)$ and $\hat\eta(Y_1, \dots, Y_n)$ be estimates of $\xi$ and $\eta$, the two location parameters. Then the generalised U-statistic with the observations centered at the respective location parameters and the modified generalised U-statistic for the two sample problem are respectively,

$$U_N = \binom{m}{s_1}^{-1} \binom{n}{s_2}^{-1} \sum \varphi(X_{\alpha_1} - \xi, \dots, X_{\alpha_{s_1}} - \xi;\ Y_{\beta_1} - \eta, \dots, Y_{\beta_{s_2}} - \eta),$$

$$\hat U_N = \binom{m}{s_1}^{-1} \binom{n}{s_2}^{-1} \sum \varphi(X_{\alpha_1} - \hat\xi, \dots, X_{\alpha_{s_1}} - \hat\xi;\ Y_{\beta_1} - \hat\eta, \dots, Y_{\beta_{s_2}} - \hat\eta),$$

where $\varphi(u_1, \dots, u_{s_1};\ v_1, \dots, v_{s_2})$ is a function symmetric in $u$ and in $v$ and the summation runs over all subscripts $\alpha, \beta$ such that

$$1 \le \alpha_1 < \alpha_2 < \cdots < \alpha_{s_1} \le m,$$

$$1 \le \beta_1 < \beta_2 < \cdots < \beta_{s_2} \le n.$$
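
A direct, brute-force way to evaluate a two-sample generalised U-statistic is to average the kernel over all index subsets. The sketch below is illustrative only: the function name `generalized_u`, the argument names, and the convention that the kernel receives two tuples are assumptions made for the example. Passing the true location parameters as centers gives the centered statistic; passing estimates gives the modified one.

```python
from itertools import combinations
from math import comb


def generalized_u(xs, ys, kernel, s1, s2, center_x=0.0, center_y=0.0):
    """Average of kernel(u, v) over all s1-subsets of xs and s2-subsets of ys.

    kernel(u, v) takes two tuples of centered observations and is assumed
    to be symmetric in the entries of u and in the entries of v.
    """
    total = 0.0
    for xa in combinations(xs, s1):
        u = tuple(x - center_x for x in xa)
        for yb in combinations(ys, s2):
            v = tuple(y - center_y for y in yb)
            total += kernel(u, v)
    return total / (comb(len(xs), s1) * comb(len(ys), s2))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [2.5, 0.5]
    # Kernel with s1 = s2 = 1: indicator that the X value lies below the Y value.
    below = lambda u, v: 1.0 if u[0] < v[0] else 0.0
    print(generalized_u(xs, ys, below, 1, 1))
    # Kernel with s1 = 2, s2 = 1: both X values lie below the Y value.
    both_below = lambda u, v: 1.0 if max(u) < v[0] else 0.0
    print(generalized_u(xs, ys, both_below, 2, 1))
```

The cost grows like $\binom{m}{s_1}\binom{n}{s_2}$, so this form is only practical for small samples or small $s_1, s_2$.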


DEFINITION 2.3. Let $\hat W_N$ be a test based on the statistic $\hat U_N$. If the asymptotic distribution of $\hat U_N$ is independent of the original populations from which the X's and the Y's are drawn under the null hypothesis, the test $\hat W_N$ will be said to be asymptotically distribution free.

Finally we define a quantity $L_N$ required in the study of the asymptotic behavior of modified generalised U-statistics.

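DEFINITION 2.4.

$$L_N = \binom{m}{s_1}^{-1} \binom{n}{s_2}^{-1} \sum \left[ \varphi(X_{\alpha_1} - \hat\xi, \dots, X_{\alpha_{s_1}} - \hat\xi;\ Y_{\beta_1} - \hat\eta, \dots, Y_{\beta_{s_2}} - \hat\eta) - A(\hat\xi - \xi, \hat\eta - \eta) \right],$$

where

$$A(t_1 - \xi, t_2 - \eta) = E\varphi(X_1 - t_1, \dots, X_{s_1} - t_1;\ Y_1 - t_2, \dots, Y_{s_2} - t_2),$$

the expectation being taken with respect to all the X's and the Y's.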


3. The limiting distribution of $L_N$. In this section, we will prove theorems giving the conditions under which $L_N$ and $U_N$ have the same asymptotic normal distribution. We will start with the one sample problem and then extend the result to two samples. In what follows, we write $\mathcal{L}(X_n) \to \mathcal{L}(X)$ (read: the distribution law of $X_n$ converges to the distribution law of $X$), or $\lim_{n\to\infty} \mathcal{L}(X_n) = \mathcal{L}(X)$, if $F_n(a) \to F(a)$ at every point $a$ of continuity of $F$, where $F_n$ and $F$ are the c.d.f.'s of $X_n$ and $X$, respectively.

THEOREM 3.1. Let $X_1, X_2, \dots, X_n$ be $n$ independent identically distributed random variables with c.d.f. $F(x - \xi)$. Let $\varphi(u_1, u_2, \dots, u_s)$ with $s \le n$ be a real valued symmetric function of its arguments such that if

(3.0) $W(x_1, x_2, \dots, x_s, t) = \varphi(x_1 - t, \dots, x_s - t) - A(t - \xi)$,
where $A(t - \xi) = E_\xi\,\varphi(X_1 - t, \dots, X_s - t)$, the following conditions are satisfied.
Way, a2,++:,2.,0)) S M,,and E|W(X,,---, X.;¢+ h) 
— W(X, ---, X38) S Moh, Mi and Mz being fixed constants 


(B,) 


There exists a sequence {t;| such that for each set of x’s 


sup | W(x,,---,2,,t;) — Wl, ---,2,,9) 


0 


(Bo) 


= sup | W(x, ---,2.,t) — W(xy,--+ , 2,9) |. 
each 


Further, let $\hat\xi(X_1, X_2, \dots, X_n)$ be an estimate of $\xi$ such that given $\epsilon_1 > 0$, there exists a number $b$ such that for $n$ sufficiently large

$$P\left\{ |\hat\xi - \xi| > \frac{b}{\sqrt{n}} \right\} < \epsilon_1.$$

Define 


=i 
v. = (") > ¢(X., —E,+s-, zz 


Ss 


the summation being taken over all subscripts a such that 
lSa<a<-+::-+<a,fn 


and 


Le =(") Llel., 


Then

$$\lim_{n\to\infty} \mathcal{L}(\sqrt{n}\,L_n) = \lim_{n\to\infty} \mathcal{L}(\sqrt{n}[U_n - EU_n]) = N(0, s^2\zeta_1),$$

where

$$\zeta_1 = E\varphi_1^2(X_1 - \xi) - E^2\varphi(X_1 - \xi, \dots, X_s - \xi),$$

$$\varphi_1(x_1 - \xi) = E\varphi(x_1 - \xi, X_2 - \xi, \dots, X_s - \xi).$$

PROOF. For the sake of simplicity we may, without loss of generality, assume $\xi = 0$. However, before we proceed further, we shall first prove the following lemma, which we shall use in the proof of the theorem.

LEMMA 3.2. Left 


ae 


and 


‘) 
» Zags - 
Vin 


Then, if r6 < t S (r + 1)6 and n is sufficiently large, 
16 26 
LS) a 


Vn J Vn 


a r oH) 
Vn 


— EH, (x, i Soe +2) 
Vn 


| a (x. ote = = 
Vn 


u 


— EH, (x... ae y= a) Tl —Oasn— <«, 


= @“\ @“% *°°* 


Shi < fe <---> 


see 7 ’ 2 a( — ré) 
3.9) (Gn) 71S... |" s at - ; 
Vn 


where $d$ is a fixed constant and higher powers of $1/\sqrt{n}$ are neglected.
PROOF. (i) and (ii) are easily obtained as consequences of conditions ($B_1$) and ($B_2$) of Theorem 3.1. To prove (iii) we have
E Sy n(t) |? 


= 2 (") Z E ‘|W x: are a . . ) ~ WW i: see Xa,» 6) 
8 ap \ V/ n Vn 
? ‘ t = 
0 (Kase Xa, ~) W(X i ia +) |}. 
| , ; Vn , ' Vn 


Consider a typical term; with c integers common to the two terms. We then have 


E{| W (Xa, pry ie ‘.) «ie ee. nig : 
k Vin Vi 
X .) —w(x “sae YI 
aed = “4 Bi) Xa 
, t ré 
° Xa, oe -) ~~ (x. pore Xess | 
Vn Vn 
sl as ‘ 6 
+ oe ‘) iW (x si nied gl *.)| 
° vn : P Vn | 
‘iwix , t 7 — 
<2ME W ( 5X) = im. +++, Xoyy 7) 
Vn Vn 


uu, ¢—* 
~+/ n 


The total contribution of such terms to 


we pail n\~* n . ) 
E\S,.n(¢)| n (") bs _ .) - A(t — rb) ‘Vn, 


A being some fixed constant. It follows that 


1 


e 


= (t — 1rd). 


nm 


When $c = 0$, the expectation of the product is zero. Retaining only powers of $1/\sqrt{n}$, the result now follows. Q.E.D.
PROOF OF THEOREM 3.1. Let


a 
Ss = Jn (") Za lw (x. ee <2) — W (x. ee 0) |. 
‘ a n 


Then it is easily seen that 


S,(0) = S;n(t) + So,n(r6). 
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Let « > 0, 6 = €/2M2, and ¢ be such that r6 < ¢ < (r + 1)6. Then it is seen 


that 
; t 
Vin 
i. l ’) 
“is Vv nN 
r+ *) 
"Vn 
r+ ve 
se Vn 
(r+ ue 
7 a 


\ i) 
ae Vn 


where 


D, = vn (") | 


~~ ( ( 1)é 
. (") > e{| He (m. se ie, PES *) 
8 a. { Vn 
; 7 _ (r+ 16 
aa EH,» Bax _ ans ) 
Vn 


(r+ 1)6 5( ] ) 
| a, ; (Xs. .-»,Xp,.°—F ) — FH, (x, re a yi 
Vn Vn ) 


the summation having the same meaning as before. Considering again a typical 
term with c integers common to the two terms, we find that the total contribu- 
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tion of such terms to ED; is 


x, £422) 
a) 
ae 


2 l. When ce 0, the expectation ot 


: V n 
— / n 
, NV n 


which tends to zero by Lemma 3.2 for c 
the product is identically equal to zero. 
Hence ED; Oasn— ~. It follows that 
P} sup S,..(t)| > «| 0. 
6<t< (r+i)é 
for every r 
Also ES ,.(r6)* < 2MyMoré / /n — 0 as n — ~, therefore, 


It follows that sup; , C being some bounded set. Hence 


2 


5 (y ni) 


that is, 


therefore 

lim lim ; 

ny ak, = n+» L(y n(l 
But by Hoeffding’s Theorem 7.1, page 305 of [10], U,, is asymptotically normally 
distributed, whence the required result follows. Q.E.D. 

We complete this section by stating without proof the generalization of the above result to the two sample problem. The proof goes more or less along the same lines as that of Theorem 3.1.

THEOREM 3.3. Let $X_1, X_2, \dots, X_m$ and $Y_1, Y_2, \dots, Y_n$ be two independent samples drawn from populations with c.d.f.'s $F(x - \xi)$ and $G(x - \eta)$ respectively. Further, let $\varphi(u_1, \dots, u_{s_1};\ v_1, \dots, v_{s_2})$ with $s_1 \le m$ and $s_2 \le n$ be a real-valued function symmetric in $u$ and in $v$ separately such that if
W (ti, Hig *** 5S 5s Mis ¢ Hes G19 


(3.10) 
At — hy: - fy ~ GH — 4° 


the following conditions are satisfied: 
W(21,%2,°°° 
E\W(X1, °° Xn, Vi, e+: » Your hh + A, be) 
~ 1. >... x. 
RRs. +++ Beas, -* Maks 
— Wins, ***, Ans 


(B 


where My, , My and Mx are certain fixed constants. 
There exist sequences {t;} and {l;| such that for every set of x’s and y’s, 


sup |W(x, ---,25; 
Stj;ok 
<ljeke 


— Wits, *-- : Ba iths <** 5 tas O®@ 


Further, let (X,, ++: X,) and 94(¥1,--- , Yn) be estimates of § and n respectively 
such that given « > 0 and «& > 0, there exist numbers b, and bz such that for m 
and n sufficiently large 


(3.11) 
(3.12) 


Define 


~t ~4 
Uy = (") o oi.) 
8; 82 af . 


the summation being taken over all subscripts a, 8 such that 


is a<a@<-* <a“, & 


1lSA<h < +--+ < By 


(3.14)
Then if $m = Np$ and $n = N(1 - p)$,

(3.15) $\lim_{N\to\infty} \mathcal{L}(\sqrt{N}\,L_N) = \lim_{N\to\infty} \mathcal{L}(\sqrt{N}(U_N - EU_N)) = N(0, \sigma^2),$

where $\sigma^2$ is the asymptotic variance of $U_N$ and is given by

(3.16) $\sigma^2 = \dfrac{s_1^2}{p}\,\zeta_{10} + \dfrac{s_2^2}{1 - p}\,\zeta_{01},$

where $\zeta_{10}$ and $\zeta_{01}$ have the same meaning as in (2.1).


4. The asymptotic distribution of modified generalised U-statistics. We are now in a position to consider the statistic $\hat U_N$ and obtain conditions under which it has the same asymptotic normal distribution as the statistic $U_N$. This result is contained in Theorem 4.1.

THEOREM 4.1. If in addition to the conditions of Theorem 3.1,

(i) $\sqrt{n}(\hat\xi - \xi)$ has a limiting distribution
and

(ii) $A(t) = E\{\varphi(X_1 - t, \dots, X_s - t) \mid \xi = 0\}$ has a derivative continuous in the neighbourhood of the origin, then

(a) If $A'(0) = 0$, where $A'(0) = \left.\dfrac{dA(t)}{dt}\right|_{t=0}$,

$$\lim_{n\to\infty} \mathcal{L}(\sqrt{n}[\hat U_n - EU_n]) = \lim_{n\to\infty} \mathcal{L}(\sqrt{n}[U_n - EU_n]) = N(0, s^2\zeta_1).$$

(b) If $A'(0) \ne 0$, $\hat\xi$ is asymptotically normally distributed and the joint distribution of $\hat\xi$ and $U_n$ is asymptotically normal, then $\sqrt{n}(\hat U_n - EU_n)$ is asymptotically normally distributed.


PROOF. We have

$$\sqrt{n}[\hat U_n - EU_n] = \sqrt{n}[\hat U_n - A(\hat\xi - \xi)] + \sqrt{n}[A(\hat\xi - \xi) - EU_n].$$

But $A(\hat\xi - \xi) = A(0) + (\hat\xi - \xi)A'(h)$ where $h = \theta(\hat\xi - \xi)$, $|\theta| < 1$. Therefore

$$\sqrt{n}[\hat U_n - EU_n] = \sqrt{n}[\hat U_n - A(\hat\xi - \xi)] + \sqrt{n}(\hat\xi - \xi)\,A'(h).$$

Since $\sqrt{n}(\hat\xi - \xi)$ has a limiting distribution and $A'(0) = 0$, it follows from the continuity considerations and Slutsky's theorem that $\sqrt{n}[\hat U_n - EU_n]$ and $\sqrt{n}[\hat U_n - A(\hat\xi - \xi)]$ have the same asymptotic distribution. But by Theorem 3.1, $\sqrt{n}[\hat U_n - A(\hat\xi - \xi)]$ and $\sqrt{n}[U_n - EU_n]$ have the same asymptotic normal distribution. It follows that $\sqrt{n}[\hat U_n - EU_n]$ and $\sqrt{n}[U_n - EU_n]$ have the same asymptotic normal distribution. This proves (a).

To prove (b), it is sufficient to remark that because of Theorem 3.1 and Slutsky's Theorem, the joint distribution of $\sqrt{n}(\hat\xi - \xi)$ and $\sqrt{n}[\hat U_n - A(\hat\xi - \xi)]$ is asymptotically normal. Q.E.D.

In the preceding theorem we make the following observations.

(1) If $A'(0) = 0$, then $\sigma^2(\hat U_n) = \sigma^2(U_n)$.

(2) If $A'(0) \ne 0$, then $\sigma^2(\hat U_n) = \sigma^2(U_n)$, if and only if

$$A'(0) = -\frac{2\sigma(U_n, \hat\xi)}{\sigma^2(\hat\xi)},$$

where $\sigma(U_n, \hat\xi)$ is the asymptotic covariance between $U_n$ and $\hat\xi$ and $\sigma^2(\hat\xi)$ is the asymptotic variance of $\hat\xi$.

For the sake of simplicity we will now consider the special case when $s = 1$, $\hat\xi$ is the sample median and $f(x)$ is symmetric about the median which may be taken to be the origin.

Now

$$A(t) = E\varphi(X - t) = \int \varphi(x - t) f(x)\,dx = \int \varphi(y) f(y + t)\,dy.$$

If there exists an integrable function $g(y)$ such that

(4.1) $\left| \dfrac{f(y + t) - f(y + t_0)}{t - t_0} \right| < g(y)$

and the derivative of $f$ exists almost everywhere except for a set of measure zero, then

(4.2) $A'(0) = \int \varphi(y) f'(y)\,dy.$

Also it has been shown in [11] that the joint distribution of $U_n$ and $\hat\xi$ is asymptotically normal and that

(4.3) $\sigma^2(\hat\xi) = \dfrac{1}{4 n f^2(0)}$

and

(4.4) $\sigma(U_n, \hat\xi) = \dfrac{1}{2 n f(0)} \int_0^\infty [\varphi(x) - \varphi(-x)]\, f(x)\,dx.$

Hence $\sigma^2(\hat U_n) = \sigma^2(U_n)$ if and only if

(4.5) $\int_0^\infty \left[ 4 f(0) + \dfrac{f'(x)}{f(x)} \right] [\varphi(x) - \varphi(-x)]\, f(x)\,dx = 0.$

We will now show that the condition (4.5) implies that $\varphi(x) - \varphi(-x) = 0$ almost everywhere. To show this, it is enough to consider the subfamily of probability densities given by

(4.6) $f(x, \theta) = \dfrac{1}{2\theta}\, e^{-|x|/\theta}.$

We observe that the derivative of $f$ exists everywhere except at the origin. Also, we have

(4.7) $\left| \dfrac{e^{-|x+h|/\theta} - e^{-|x|/\theta}}{h/\theta} \right| \le c\, e^{-|x|/\theta}$

for $h$ sufficiently small, $c$ being a fixed constant. Condition (4.1) is thus satisfied for the family of distributions (4.6). On substitution, condition (4.5) becomes

(4.8) $\int_0^\infty e^{-x/\theta} [\varphi(x) - \varphi(-x)]\,dx = 0,$

whence it follows from the unicity of the unilateral Laplace transform that $\varphi(x) - \varphi(-x) = 0$ almost everywhere, in which case $A'(0) = 0$ and condition (2) reduces to condition (1).

It is now clear that $A'(0) = 0$ is a necessary and sufficient condition that $\hat U_n$ and $U_n$ have the same asymptotic normal distribution.

We will now extend the results of Theorem 4.1 to the two-sample problem.

THEOREM 4.2. If in addition to the conditions of Theorem 3.3,

(i) $\sqrt{N}(\hat\xi - \xi)$ and $\sqrt{N}(\hat\eta - \eta)$ have limiting distributions
and

(ii) $A(t_1, t_2) = E\{\varphi(X_1 - t_1, \dots, X_{s_1} - t_1;\ Y_1 - t_2, \dots, Y_{s_2} - t_2) \mid \xi = \eta = 0\}$

possesses first order partial derivatives continuous in the neighborhood of the origin, then

(a) If

$$\left.\frac{\partial A(t_1, t_2)}{\partial t_1}\right|_{t_1 = t_2 = 0} = \left.\frac{\partial A(t_1, t_2)}{\partial t_2}\right|_{t_1 = t_2 = 0} = 0,$$

$$\lim_{N\to\infty} \mathcal{L}(\sqrt{N}[\hat U_N - EU_N]) = \lim_{N\to\infty} \mathcal{L}(\sqrt{N}[U_N - EU_N]) = N(0, \sigma^2),$$

where $\sigma^2$ is the asymptotic variance of $U_N$.

(b) If the above condition is not satisfied, $\hat\xi$ and $\hat\eta$ are asymptotically normally distributed and the joint distribution of $\hat\xi$, $\hat\eta$ and the U-statistic is asymptotically normal, then $\sqrt{N}[\hat U_N - EU_N]$ is asymptotically normally distributed.

PROOF. The proof of this theorem goes along exactly the same lines as that of Theorem 4.1 and is fairly obvious. Q.E.D.
It may be remarked here that the results of Secs. 3 and 4 can be extended to random vectors as also to functions of several U-statistics. The proof follows
in exactly the same way as the theorem on the asymptotic distribution of a 
function of moments follows from the fact of their asymptotic normality [12]. 
We shall content ourselves by stating an analogue of Theorem 4.2 as applied to 
several U-statistics. 

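THEOREM 4.3. With reference to the two sample problem, let

$$\varphi^{(\gamma)}(u_1, u_2, \dots, u_{s_1(\gamma)};\ v_1, \dots, v_{s_2(\gamma)}), \qquad \gamma = 1, \dots, q,$$

with $s_1(\gamma) \le m$ and $s_2(\gamma) \le n$ be $q$ real valued functions symmetric in $u$ and in $v$. Further, let

$$W^{(\gamma)}(x_1, \dots, x_{s_1(\gamma)};\ y_1, \dots, y_{s_2(\gamma)};\ t_1, t_2) = \varphi^{(\gamma)}(x_1 - t_1, \dots, x_{s_1(\gamma)} - t_1;\ y_1 - t_2, \dots, y_{s_2(\gamma)} - t_2) - A^{(\gamma)}(t_1, t_2),$$

where

$$A^{(\gamma)}(t_1, t_2) = E\{\varphi^{(\gamma)}(X_1 - t_1, \dots, X_{s_1(\gamma)} - t_1;\ Y_1 - t_2, \dots, Y_{s_2(\gamma)} - t_2) \mid \xi = \eta = 0\}$$

possess partial derivatives continuous in the neighborhood of the origin and $W^{(\gamma)}$ satisfy the conditions ($B_1$) and ($B_2$) of Theorem 3.3 for $\gamma = 1, \dots, q$. Also let $\sqrt{N}(\hat\xi - \xi)$ and $\sqrt{N}(\hat\eta - \eta)$ have limiting distributions where the estimates $\hat\xi$ and $\hat\eta$ satisfy the conditions (3.11) and (3.12) of Theorem 3.3. Define

$$\hat U_N^{(\gamma)} = \binom{m}{s_1(\gamma)}^{-1} \binom{n}{s_2(\gamma)}^{-1} \sum \varphi^{(\gamma)}(X_{\alpha_1} - \hat\xi, \dots, X_{\alpha_{s_1(\gamma)}} - \hat\xi;\ Y_{\beta_1} - \hat\eta, \dots, Y_{\beta_{s_2(\gamma)}} - \hat\eta),$$

the summation having the same meaning as before. Then

(i) A necessary and sufficient condition that the joint asymptotic distribution of

$$\sqrt{N}(\hat U_N^{(1)} - EU_N^{(1)}), \dots, \sqrt{N}(\hat U_N^{(q)} - EU_N^{(q)})$$

be the same as the joint asymptotic distribution of

$$\sqrt{N}(U_N^{(1)} - EU_N^{(1)}), \dots, \sqrt{N}(U_N^{(q)} - EU_N^{(q)})$$

is that

$$\left.\frac{\partial A^{(\gamma)}(t_1, t_2)}{\partial t_1}\right|_{t_1 = t_2 = 0} = \left.\frac{\partial A^{(\gamma)}(t_1, t_2)}{\partial t_2}\right|_{t_1 = t_2 = 0} = 0$$

for $\gamma = 1, 2, \dots, q$.

(ii) A necessary and sufficient condition that the asymptotic distribution of $\sqrt{N} \sum_{\gamma=1}^{q} C_\gamma (\hat U_N^{(\gamma)} - EU_N^{(\gamma)})$ be the same as the asymptotic distribution of

$$\sqrt{N} \sum_{\gamma=1}^{q} C_\gamma (U_N^{(\gamma)} - EU_N^{(\gamma)})$$

is that

$$\sum_{\gamma=1}^{q} C_\gamma \left.\frac{\partial A^{(\gamma)}(t_1, t_2)}{\partial t_1}\right|_{t_1 = t_2 = 0} = 0$$

and

$$\sum_{\gamma=1}^{q} C_\gamma \left.\frac{\partial A^{(\gamma)}(t_1, t_2)}{\partial t_2}\right|_{t_1 = t_2 = 0} = 0.$$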


5. Consequences of Theorem 4.2. In this section, we will consider some of the tests of a class $\{\hat W_N\}$ based on a class of statistics $\{\hat U_N\}$ for testing the hypothesis that two populations differ only in location and investigate whether they are asymptotically distribution free.

Consider first the test statistic $T$ proposed in [2] based on a sample of $m$ X's and $n$ Y's. The test statistic may be defined as

(5.1) $T = \dfrac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} K(X_i, Y_j),$

where

(5.2) $K(X, Y) = 1$ if either $0 < X < Y$ or $Y < X < 0$, and $K(X, Y) = 0$ otherwise.

A corresponding modified test is then based on the statistic

(5.3) $\hat T = \dfrac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} K(X_i - \tilde X, Y_j - \tilde Y),$

$\tilde X$ and $\tilde Y$ being the sample medians. Let $\xi = \eta = 0$. We then have

(5.4) $A(t_1, t_2) = EK(X - t_1, Y - t_2) = \int_0^\infty [1 - G(x + t_2)]\,dF(x + t_1) + \int_{-\infty}^0 G(x + t_2)\,dF(x + t_1).$

Also $W(x, y, t_1, t_2) = K(x - t_1, y - t_2) - A(t_1, t_2)$. It can then be shown that

$$E|W(X, Y, t_1, 0) - W(X, Y, 0, 0)| \le 3 \int |F(x + t_1) - F(x)|\,dG(x) + 2|F(t_1) - F(0)| \le 5 a t_1,$$

if the distribution function $F$ has a derivative $F'$ bounded in absolute value by $a$. Similarly, it can be shown that

$$E|W(X, Y, 0, t_2) - W(X, Y, 0, 0)| \le 5 b t_2,$$

provided the distribution function $G$ has a derivative $G'$ bounded in absolute value by $b$.
Hence the condition ($B_1$) of Theorem 3.3 is satisfied. Observing that $K$ can be expressed as a difference of two monotone functions, it is easy to see that condition ($B_2$) is also satisfied. Again, we have

$$\frac{\partial A(t_1, t_2)}{\partial t_1} = -f(t_1)[1 - 2G(t_2)] + \int_0^\infty f(x + t_1)\,dG(x + t_2) - \int_{-\infty}^0 f(x + t_1)\,dG(x + t_2),$$

$$\frac{\partial A(t_1, t_2)}{\partial t_2} = -\int_0^\infty g(x + t_2)\,dF(x + t_1) + \int_{-\infty}^0 g(x + t_2)\,dF(x + t_1).$$

Clearly,

$$\left.\frac{\partial A(t_1, t_2)}{\partial t_1}\right|_{t_1 = t_2 = 0} = \left.\frac{\partial A(t_1, t_2)}{\partial t_2}\right|_{t_1 = t_2 = 0} = 0$$

if $f(x)$ and $g(x)$ are symmetric about the origin. Conditions (a) of Theorem 4.2 are satisfied. Hence $\hat T$ has the same asymptotic normal distribution as the statistic $T$. This consequence is stated in Theorem 5.1.
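
The vanishing of both partial derivatives at the origin can be checked numerically in a simple special case. The sketch below assumes, purely for illustration, that $F = G$ is the standard normal distribution, evaluates $A(t_1, t_2)$ as the sum of the two integrals over $x > 0$ and $x < 0$ by the trapezoid rule, and forms central differences at the origin. The function names and the integration limits are choices made for the example.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def trapezoid(fn, lo, hi, steps=4000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (fn(lo) + fn(hi))
    for i in range(1, steps):
        total += fn(lo + i * h)
    return total * h


def A(t1, t2, limit=10.0):
    """A(t1, t2) = E K(X - t1, Y - t2) for F = G standard normal."""
    upper = trapezoid(lambda x: (1.0 - Phi(x + t2)) * phi(x + t1), 0.0, limit)
    lower = trapezoid(lambda x: Phi(x + t2) * phi(x + t1), -limit, 0.0)
    return upper + lower


def partials_at_origin(step=1e-3):
    """Central-difference estimates of the two partial derivatives at (0, 0)."""
    d1 = (A(step, 0.0) - A(-step, 0.0)) / (2.0 * step)
    d2 = (A(0.0, step) - A(0.0, -step)) / (2.0 * step)
    return d1, d2


if __name__ == "__main__":
    print(A(0.0, 0.0))
    print(partials_at_origin())
```

In this special case $A(0, 0) = 2\int_0^\infty [1 - \Phi(x)]\,\varphi(x)\,dx = 1/4$, and both difference quotients come out numerically indistinguishable from zero, as the symmetry argument predicts.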

THEOREM 5.1. If the X's and the Y's are distributed symmetrically about the respective medians and have bounded density functions, the test of the hypothesis $H$ based on the statistic $\hat T$ is asymptotically distribution free.
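
For concreteness, the statistic with known centers and its median-centered version can be computed directly from the kernel $K$, taking $K(x, y) = 1$ when either $0 < x < y$ or $y < x < 0$. The sketch below is an illustration with hypothetical function names (`K`, `t_stat`, `t_hat`) and made-up data; ties and observations equal to the median are simply given kernel value 0.

```python
from statistics import median


def K(x, y):
    """Kernel: 1 if either 0 < x < y or y < x < 0, else 0."""
    return 1 if (0 < x < y) or (y < x < 0) else 0


def t_stat(xs, ys, center_x=0.0, center_y=0.0):
    """Average of K over all (X, Y) pairs after centering each sample."""
    total = sum(K(x - center_x, y - center_y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def t_hat(xs, ys):
    """Modified statistic: each sample is centered at its own median."""
    return t_stat(xs, ys, median(xs), median(ys))


if __name__ == "__main__":
    xs = [1, 2, 3]
    ys = [-4, 0, 4]
    print(t_stat(xs, ys))   # centers taken as known and equal to 0
    print(t_hat(xs, ys))    # centers estimated by the sample medians
    # Shifting either sample does not change the modified statistic.
    print(t_hat([x + 10 for x in xs], [y - 7 for y in ys]))
```

Large values indicate that the Y's tend to lie further from their center than the X's, which is why the statistic is sensitive to differences in spread.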

Consider now the test statistic suggested by Mood [3]. The test statistic may be defined as

(5.5) $M = \sum_{i=1}^{n} \left( r_i - \dfrac{m + n + 1}{2} \right)^2,$

where $r_i$ is the rank of $Y_i$ in the combined sample of $(m + n)$ observations.
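
A minimal sketch of this rank statistic follows, assuming the usual form of Mood's statistic (the sum over the Y's of the squared deviation of the combined-sample rank from the mid-rank $(m + n + 1)/2$) and assuming no ties. The rank of each $Y_i$ is obtained by counting the observations below it, and the function names are hypothetical.

```python
def combined_ranks_of_ys(xs, ys):
    """Rank of each Y in the pooled sample, assuming no ties.

    The rank is one plus the number of X's and Y's strictly below it.
    """
    ranks = []
    for y in ys:
        below = sum(1 for x in xs if x < y) + sum(1 for other in ys if other < y)
        ranks.append(1 + below)
    return ranks


def mood_statistic(xs, ys):
    """Sum of squared deviations of the Y-ranks from the mid-rank."""
    big_n = len(xs) + len(ys)
    mid_rank = (big_n + 1) / 2
    return sum((r - mid_rank) ** 2 for r in combined_ranks_of_ys(xs, ys))


if __name__ == "__main__":
    xs = [1, 3, 5]
    ys = [2, 4, 6]
    print(combined_ranks_of_ys(xs, ys))
    print(mood_statistic(xs, ys))
```

Because the rank is written as a sum of indicator comparisons, the statistic can be expanded into averages of kernels over pairs and triples of observations, which is the route taken in the text.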
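Noting that

(5.6) $r_i = 1 + \sum_{j=1}^{m} \varphi(X_j, Y_i) + \sum_{k=1}^{n} \varphi(Y_k, Y_i),$

where
$\varphi(u, v) = 1$ if $u < v$,
$\varphi(u, v) = 0$ otherwise,
it is easy to see that if $m + n = N$, and
$\psi(u, v, w) = 1$ if $u < w$ and $v < w$,
$\psi(u, v, w) = 0$ otherwise,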
(5.7) — = C,Ux’ + G2UxX? + C3UN? + p(t), 


where $C_1$, $C_2$, $C_3$ are certain known fixed constants, $P(1/N)$ is a third-degree polynomial in $1/N$ and


OF af ) > d& WX;, X:, Yd, 


t jk 


—! -—l mm n 
= @) (5) Zz, » WX;, Y;,; Yi), 
o j ; 


UY = i? (") LL o(Xy, Vo 


1 1 é 


-statistics so that 


C,0. + ¢: OF + C:0% + P (x), 


are three generalised U’ 


(5.9) 
where C\? is obtained from U\ by centering the observations at the respective 
sample medians. Consider the statistic 0}. We have, 
A(t, te) = Eg(X al ti, Y — ts) 
F(x + t,) dG(x + t), 


W(x, y, th, te) = oz — hy — be) — AM (h, b). 


It can then be shown that 

E\W(X.3,1,.0) — W'°CX, Y, 0, 0)| S$ 2a, 
and 

EW (X, Y,0, tL) — W(X, Y, 0, 0)! S 2bt.. 


Condition (B;) of Theorem 3.3 is thus satisfied. Exactly in the same manner, 


it ean be shown that the condition (B;) is also satisfied by the statisties Oy” 
and CX. Condition (By) is also easily seen to be satisfied. Also we have 


BY, = 4.7, ~&,% - it) 


| F(x + G(x + tr) dG(z + by), 
as EYWX; hi, X; aa hy Y; — bh) 


= F’(x + t;) dG(x + t), 


9 4 . to) aca — 
dA (h, b) = 2 [ F(x)fiz)q(ax) dx, 


Oty ty=to=0 


pA” (t,, te) | eae 
04 4, &) = [ G(x)flx)g(2) dz, 


Ot; |t;=te=0 


(3) ) | 
0A (t1, te) | = [ s(x)g(x) dx. 


dt lt=te=0 







- C, a. te tr) | 


y=l ot, t;=t2=0 


~ 0. 


Similarly, it is easy to see that 


aA (ty, tr) | 


= 0. 
Ot, t;=t2=0 


Hence, it follows as a consequence of Theorem 4.3 that the statistics $\hat M$ and $M$ do not have the same asymptotic normal distribution. It follows that the test based on the statistic $\hat M$ is not asymptotically distribution free.


6. Small sample behavior of the proposed test. It was shown in the previous section that the test statistic $\hat T$ is asymptotically distribution free. We will now give some idea regarding the small sample behavior of this test by considering the simplest possible case, namely $m = n = 3$. The computations involved even in this relatively simple case are very extensive. We will consider the one sided test of the hypothesis

$H: \delta = 1$,
$A: \delta > 1$.

We will consider some special alternatives and obtain the size and the relative efficiency of the test $\hat T$ with respect to the corresponding best test for each of these alternatives. These results are presented in Table 1.


TABLE 1 


Relative efficiency of T test w. r.t. the 
" corresponding best test for— 
Population Size of T test ee — . 


6 =2 6= 3 








Normal 2 0.83 
Uniform 25 0.7 
Double exponential . .25 0.92 


From the above results we see that the size of the test remains more or less 
constant. The test is highly efficient for exponential alternatives and moder- 
ately so for normal and uniform alternatives. 


7. Acknowledgment. I wish to express my deepest gratitude to Professors Erich Lehmann and Lucien LeCam, who encouraged me to work on this problem, suggested the topic and gave generous help and guidance during the course of the entire work.

ESTIMATION OF LOCATION AND SCALE PARAMETERS BY ORDER
STATISTICS FROM SINGLY AND DOUBLY CENSORED SAMPLES¹
PART II. TABLES FOR THE NORMAL DISTRIBUTION FOR SAMPLES
OF SIZE 11 ≤ n ≤ 15

By A. E. SARHAN AND B. G. GREENBERG 
University of North Carolina 


1. Introduction. In a previous paper [2], estimation of the mean and standard 
deviation from singly and doubly censored samples drawn from the normal 
distribution were considered for samples n < 10. The generalization of an al- 
ternative estimate for these parameters was also obtained. 

In the present work, all calculations and tables obtained for the corresponding 
items in Part I are extended up ton S 15 

The method is to obtain the best linear unbiased estimates of the mean and
standard deviation by taking the best linear combination of the ordered
observations. The variances and covariances of the order statistics for samples
11 ≤ n ≤ 15 which are required in carrying out these calculations are obtained
from Table I in [2].
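
As an illustration of the method only (not a reproduction of the authors' tables), the best linear unbiased coefficients can be sketched by generalized least squares for the smallest case, a complete sample of size n = 2 from a normal population. The moments of the two standard normal order statistics used here (means ∓1/√π, variances 1 − 1/π, covariance 1/π) are standard values; the variable names are illustrative assumptions.

```python
import numpy as np

# Standard normal order statistics for n = 2 (complete sample).
m = np.array([-1.0, 1.0]) / np.sqrt(np.pi)      # expected values
V = np.array([[1 - 1 / np.pi, 1 / np.pi],
              [1 / np.pi, 1 - 1 / np.pi]])      # covariance matrix

# Model: E(y) = mu * 1 + sigma * m, so the design matrix has columns (1, m).
A = np.column_stack([np.ones(2), m])
Vinv = np.linalg.inv(V)

# Rows of `coef` hold the coefficients of the ordered observations in the
# best linear unbiased estimates of mu (row 0) and sigma (row 1).
coef = np.linalg.solve(A.T @ Vinv @ A, A.T @ Vinv)
print(coef)
```

For larger n, or for censored samples, the same computation would use only the rows and columns of m and V belonging to the uncensored order statistics.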

Further investigation of the efficiency of the alternative estimate under varied 
degrees of censoring shows that the alternative estimate proposed by Gupta 
[1] is better than previously supposed when judged by doubly censored samples 
rather than singly censored samples alone. 


2. Tables. Table I gives the coefficients for the best linear estimates of the
mean and standard deviation for the normal population from samples of size
11 ≤ n ≤ 15 undergoing all possible conditions of Type II censoring. Estimation
from complete or singly censored samples are simply special cases and are given
in the table ($r_1 = r_2 = 0$, and $r_1$ or $r_2 = 0$).

The best linear estimates of the mean and standard deviations are obtained by
using

\[ \mu^* = \sum_{i=r_1+1}^{n-r_2} a_{1i}\, y_{(i)}, \]

\[ \sigma^* = \sum_{i=r_1+1}^{n-r_2} a_{2i}\, y_{(i)}, \]

\[ y_{(r_1+1)} \le y_{(r_1+2)} \le \cdots \le y_{(n-r_2)}. \]
This table is a continuation of Table II in [2]. The entries in Table I, as well as in
Table II of this present paper, have been rounded to four decimal places for 
convenience. Readers who desire more precision may obtain copies of the original 
tables containing eight decimal places from the authors. The results in the eight- 
decimal-place table are exact to seven places but rounding may cause some of the 
figures in the eighth place to be a few units in error. 

If the coefficients of an estimate are sought for a value of r; not given in the 
table, the same procedure can be followed as that mentioned in Part I of this 
series. 

The variances of the estimates and their covariances are given in Table II in
terms of $\sigma^2$. This table is a continuation of Table III in [2] and the results are
given to only four decimal places for convenience.

Table III shows the efficiency of the estimates for every case of censoring 
relative to the corresponding estimate obtained by complete samples. 


3. Alternative estimate. The alternative estimate was proposed by Gupta 
[1] to replace the best linear estimate when sample sizes are greater than 10 and
censoring was from one side only. This estimate was generalized to the case of 
double censoring in Part I of this present series. The variance of the alternative 
estimates and their efficiencies relative to the best linear estimate for samples of 
sizes 12 and 15 under every case of censoring are given in Table IV. 

The authors know of no instance where the alternative estimates have been 
compared previously for sample sizes this large. 


4. Comments. The conclusions mentioned in [2], Section 5, hold true here and, 
in fact, appear much stronger for increasing sample size. Several points are worth 
emphasis: 


(1) In estimation of the mean, the relative efficiency holds up—about 65 per 
cent or better—as long as the median value remains known. (For an even n, it is 
about 70 per cent or better as long as the two middle values are uncensored.) 
This result was anticipated because the asymptotic efficiency of the median is
$2/\pi = 63.7\%$.

Another way of presenting this same finding can be seen clearly from Fig. 1
which shows the relative efficiency of the best linear estimate of $\mu$ under all
conditions of censoring a sample of size 15 from the normal distribution. Each one
of the curves shows the efficiency of the estimate of the mean for a certain
number of known elements [$k = n - (r_1 + r_2)$] for all possible values of $r_1$ and
$r_2$. The efficiency attains its maximum whenever the middle element is known.

(2) From Table III, one can see that, for fixed values of censoring from one 
side, the efficiency of the estimate of the standard deviation decreases approx- 
imately in equal amounts with each increment in the number of censored ele- 
ments on the opposite side. 

This is illustrated by Fig. 2 which shows the relative efficiency of the best
linear estimate of $\sigma$ under all conditions of censoring a sample of size 15 from the
normal distribution. In this figure, the graphs for $r_1 = 0, 1, \ldots, 12$ show a
parallelism as $r_1$ changes. Thus, for any corresponding value of $r_2$ the efficiency
decreases by about the same amount for each change in the value of $r_1$.

(3) Using Table III again and reading the entries for $\sigma^*$ in diagonal fashion,
one can see that, for a given n and fixed uncensored sample size ($r_1 + r_2$
constant), the efficiency of the best estimate of $\sigma$ is remarkably constant
independently of how $r_1$ and $r_2$ are chosen. In other words, there is practically no
difference in efficiency irrespective of the proportion of the relative censoring
from either side.

This can be observed very clearly in Fig. 2. The approximate horizontal lines
show constancy of the relative efficiencies of $\sigma^*$ for the known elements (k) of the
sample whatever may be the individual values of $r_1$ and $r_2$.

(4) From Table III (and graphs similar to Fig. 2), one can construct the
following table showing how the efficiency in estimating $\sigma^*$ varies with the
number of uncensored values for each sample size to serve as a rough guide in
censoring.


Rough guide for assessing approximate efficiency (per cent)* of estimate of $\sigma$


Number of uncensored observations in sample, ork = # — 
ple Size 


nn 
4 s 6 7 8 

11 ) 13 21 30 39 5D 6] 73 86 

12 12 19 27 35 14 ot 64 75 S7 

13 } 1] 17 24 31 39 48 57 67 77 100 

14 ‘ 10 15 22 25 36 43 51 5Y 69 7 89 100 

15 $ 9 l4 20 26 32 39 46 54 52 sO 90 100 
* These values are rounded averages of different combinations of censoring and are 

within 2 or 3 per cent in almost all cases. 


The information in this rough guide for censoring is also illustrated in Fig. 3.
The efficiency in estimating $\sigma^*$ for varying proportions of censoring is shown in
the graph for samples of size 10, 15, 20 and large n. The latter value was obtained
from Gupta [1] and represented single censoring. However, as stated previously,
the efficiency in estimating $\sigma^*$ depends primarily upon the proportion of
uncensored elements irrespective of the side and can be used in this way.

(5) Figs. 4 and 5 show the efficiencies of the alternative estimate from a 
sample of size 15 relative to the correspondingly best linear estimate for the 
mean and the standard deviation respectively. 

Judging from these figures, the worst efficiencies of the alternative estimate for
estimating both the mean and standard deviations are attained for singly
censored samples (i.e., only $r_1$ or $r_2 = 0$). Thus, the alternative estimate is relatively
more precise when applied to doubly censored samples. The alternative estimate 
was proposed by judging the results of a comparison of efficiencies using a singly 
Fig. 1. Relative efficiency of the best linear estimate of $\mu$ under all conditions of censoring
a sample of size 15 from the normal distribution. (k = number of uncensored observations = $n - (r_1 + r_2)$.)
Fig. 3. Approximate efficiencies in estimating $\sigma^*$ for censored samples of size 10, 15, 20
and large n. (Axes: percentage relative efficiency against percentage of uncensored observations.)


censored sample. The present graphs show that the alternative estimate is even 
better than previously supposed. 

Also, one can observe that for $r_1 = 0$, the alternative estimate of $\sigma$ is more
efficient than the corresponding estimate of the mean. In fact for other values of
$r_1$, the efficiencies are much more concentrated for the former than for the latter.
Again, the drop in efficiency for the estimate of the mean is much sharper than
that for the standard deviation. This shows that the alternative estimate also
appears better if one judges its value by considering its efficiency in estimation
of the standard deviation rather than of the mean alone.


Addendum. Extensions of Tables I, II, and III of this paper are now available
for 16 ≤ n ≤ 20 in eight decimal places. Copies may be obtained from the
authors.
Fig. 4. Relative efficiency of the alternative estimate of $\mu$ under conditions of censoring
a sample of size 15 from the normal population.
Fig. 5. Relative efficiency of the alternative estimate of $\sigma$ under conditions of censoring
a sample of size 15 from the normal population.
GENERALIZATIONS OF A GAUSSIAN THEOREM
By Paul S. Dwyer
University of Michigan 


1. Introduction and Summary. Plackett [1] has discussed the history and 
generalizations of the Gaussian theorem which states that least squares estimates 
are linear unbiased estimates with minimum variance. General forms of the 
theorem are due to Aitken [2], [3] and Rao [4], [5]. The essence of the proof for
Aitken’s general case consists in minimizing, simultaneously, certain quadratic 
forms involving linear combinations of the parameters. Plackett derived Aitken’s 
result by using a matrix relation. The proof of the theorem follows quickly once 
the relation is established. A somewhat similar but simpler matrix relation is used 
by Rao ([4], page 10).

Aitken [2] and Rao [4], [5] obtain minimum variance with the use of Lagrange 
multipliers. Unless one has a method of working with matrices of derivatives it 
seems necessary to differentiate with respect to the many scalars constituting the 
matrices and to assemble the results in desired matrix form. Authors frequently 
give only the assembled results ([4], page 10, [5], page 17, [6], page 83).

The question arises as to whether it is possible to use the logically preferable 
matrix derivative methods of minimization. It is shown below that the use of 
matrices of partial derivatives [7] leads logically to the solution without the 
necessity of changing to and from scalar notation, or without the necessity of 
establishing some relation which implicitly contains the solution. Matrix deriva- 
tive methods seem to be preferable methods for undertaking solutions of prob- 
lems of simultaneous matrix minimization with side conditions for the same 
reason that derivative methods are preferable to the use of some (unknown) 
relation in solving problems of minimization involving scalars. They may also 
be used in establishing the relation which may then be verified without their use. 

The paper includes generalizations of the results of Aitken [2], [3], Rao [4],
[5], and David and Neyman [8]. It gives a general formula for simultaneous 
unbiased estimators of linear functions of parameters when the parameters are 
subject to linear restrictions and shows how the results are applicable to special 
cases. It provides formulas for the variance matrix of these estimators. It
generalizes a matrix relation used by Plackett [1]. It uses the matrix square root
transformation in establishing the general result for the variance of (weighted) 
residuals when there may be linear restrictions on the parameters. It provides a 
generalization of a formula of David and Neyman [8] in estimating the variance 
matrix of the unbiased linear estimators. 


2. The least squares solution. The (inconsistent) observational equations are 


\[ A\theta = x \tag{2.1} \]


Received February 18, 1957; revised September 17, 1957. 
and the true linear regression is given by

\[ \mathcal{E}(x) = A\theta, \tag{2.2} \]

where the values of $x$, $A$, and $\theta$ are real. We set

\[ A\theta - x = \epsilon \tag{2.3} \]

so that

\[ \mathcal{E}(\epsilon) = 0. \tag{2.4} \]


In determining the least squares regression we have $\theta(s \times 1)$ as the vector of
unknown parameters, $x(n \times 1)$ as the vector of measurements of the variable of
regression, $\epsilon(n \times 1)$ as the vector of errors and $A(n \times s)$ as the matrix of
measurements of the regressed variables. We take $s < n$ and $A$ of rank $s$. Further,
under the usual regression condition of fixed $A$,

\[ V = \mathcal{E}(xx^T) - \mathcal{E}(x)\mathcal{E}(x^T) = \mathcal{E}(\epsilon\epsilon^T) = \operatorname{var}(x) = \operatorname{var}\epsilon \tag{2.5} \]

is the dispersion matrix of $x$ and $\epsilon$. We limit our discussion to the case where $V$ is
positive definite. A common dimensionless generalization of the least squares
concept uses weights for the observations with $W = V^{-1}$ and leads to

\[ Q = \epsilon^T V^{-1} \epsilon = (A\theta - x)^T V^{-1} (A\theta - x) \tag{2.6} \]

as the form to be minimized. The value of $\theta$ which minimizes (2.6) is known to be

\[ \theta^* = (A^T V^{-1} A)^{-1} A^T V^{-1} x. \tag{2.7} \]


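This result can be derived using symbolic matrix derivatives ([7], page 524).
We have successively

\[ Q = \theta^T A^T V^{-1} A\theta - \theta^T A^T V^{-1} x - x^T V^{-1} A\theta + x^T V^{-1} x, \tag{2.8} \]

\[ \frac{\partial Q}{\partial\langle\theta\rangle} = J^T A^T V^{-1} A\theta + \theta^T A^T V^{-1} A J - J^T A^T V^{-1} x - x^T V^{-1} A J, \tag{2.9} \]

\[ \frac{\partial\langle Q\rangle}{\partial\theta} = A^T V^{-1} A\theta K^T + A^T V^{-1} A\theta K - A^T V^{-1} x K^T - A^T V^{-1} x K, \tag{2.10} \]

and since $Q$ is scalar, $K = K^T = 1$. Setting $\theta = \theta^*$ when $\partial\langle Q\rangle/\partial\theta = 0$, we get (2.7).

We note that $\theta^*$ is an unbiased estimate of $\theta$ because of (2.2) and (2.7).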
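
A small numerical sketch may help in reading (2.6) and (2.7). The matrices below are randomly generated stand-ins (an assumption made purely for illustration, not data from the paper); the code computes $\theta^*$ and checks that the gradient of $Q$ vanishes there and that the estimator matrix reproduces $\theta$ when applied to $A\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 6, 2
A = rng.normal(size=(n, s))            # fixed design matrix of rank s
G = rng.normal(size=(n, n))
V = G @ G.T + n * np.eye(n)            # positive definite dispersion matrix
x = rng.normal(size=n)                 # observations

Vinv = np.linalg.inv(V)
M = A.T @ Vinv @ A                     # A^T V^{-1} A
theta_star = np.linalg.solve(M, A.T @ Vinv @ x)   # equation (2.7)

def Q(theta):
    """Weighted sum of squares (2.6)."""
    e = A @ theta - x
    return e @ Vinv @ e

# Gradient of Q, 2 A^T V^{-1} (A theta - x), evaluated at theta_star.
grad = 2 * A.T @ Vinv @ (A @ theta_star - x)

# Unbiasedness: the estimator matrix times A is the identity.
BA = np.linalg.solve(M, A.T @ Vinv @ A)
print(theta_star, grad, Q(theta_star))
```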


3. Linear estimates with minimum variance. Now consider the k linear
parametric functions

\[ \phi = L\theta, \tag{3.1} \]

where $L = L(k \times s)$ is known. Then $\phi = \phi(k \times 1)$. We wish to find

\[ \phi^* = L\theta^* \tag{3.2} \]


such that $\phi^*$ is an unbiased estimate of $\phi$ with minimum variance. This means
that the diagonal terms of var $\phi^*$ (a matrix of order $k \times k$) should attain their
minimum values simultaneously. Following Aitken we consider solutions of the
form

\[ \phi^* = Bx \tag{3.3} \]


and determine $B = B(k \times n)$. Rao [4] has shown that this homogeneous form is
the general form. The relation

\[ (BA - L)\theta = 0 \tag{3.4} \]

follows from (3.3), (2.2), and (3.1) in accordance with the requirement that $\phi^*$
be an unbiased estimate of $\phi$.

Aitken [3] has shown using Lagrangian multipliers and Plackett [1] using a
matrix relation that the value of $\theta^*$ in (3.2) which minimizes the diagonal terms of
var $\phi^*$ is identical with the $\theta^*$ resulting from least squares as given by (2.7). This
Aitken theorem is a generalization of the Gaussian theorem that least squares
linear estimators are unbiased with minimum variance.

Rao [5] further generalized the theorem with a consideration of linear
restrictions on the parameters when $k = 1$. The argument is given below for the
more general $k$. The preparation of the problem for minimization is similar to
that of Rao in the special case with $k = 1$, though there are some modifications.

The $u < s$ independent linear restrictions may be indicated by

\[ g = R\theta = 0, \tag{3.5} \]

where $R = R(u \times s)$ and $g = g(u \times 1)$, without loss of generality since any
term not having some $\theta_i$ as a factor may be multiplied by $\theta_0 = 1$ and $s$ replaced
by $s' = s + 1$. We premultiply by the undetermined $D = D(k \times u)$ to get

\[ DR\theta = Dg, \tag{3.6} \]

in which the matrix coefficient of $\theta$ has the same order as $BA$ and $L$. Then the
condition for unbiased estimation and the specific side conditions are
incorporated in the matrix relation

\[ (L - BA)\theta = 0 = DR\theta - Dg \tag{3.7} \]

so that the conditions for estimation can be written in the form

\[ (L - BA - DR) = 0 \quad\text{and}\quad Dg = 0. \tag{3.8} \]


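Specifically our purpose is the minimization of the diagonal terms of var $\phi^*$
subject to (3.8). Now

\[ \operatorname{var}\phi^* = \mathcal{E}(\phi^*\phi^{*T}) - \mathcal{E}(\phi^*)\mathcal{E}(\phi^{*T}) = B[\mathcal{E}(xx^T) - \mathcal{E}(x)\mathcal{E}(x^T)]B^T = BVB^T. \tag{3.9} \]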
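We can then use

\[ \psi = BVB^T + 2(L - BA - DR)\Lambda + 2Dg\mu^T, \tag{3.10} \]

where $\psi = \psi(k \times k)$, $\Lambda = \Lambda(s \times k)$, and $\mu = \mu(k \times 1)$, and differentiate with
respect to $B$ and $D$. We have

\[ \frac{\partial\psi}{\partial\langle B\rangle} = JVB^T + BVJ^T - 2JA\Lambda, \]

\[ \frac{\partial\langle\psi\rangle}{\partial B} = KBV + K^TBV - 2K\Lambda^TA^T, \]

so that the critical value, for each and every diagonal term, is given by

\[ BV = \Lambda^TA^T. \tag{3.11} \]

Again

\[ \frac{\partial\psi}{\partial\langle D\rangle} = -2JR\Lambda + 2Jg\mu^T, \]

\[ \frac{\partial\langle\psi\rangle}{\partial D} = -2K\Lambda^TR^T + 2K\mu g^T, \]

\[ \frac{\partial\langle\psi_{ii}\rangle}{\partial D} = -2K_{ii}\Lambda^TR^T + 2K_{ii}\mu g^T, \]

so that, for each and every diagonal term

\[ \Lambda^TR^T = \mu g^T, \]

so by (3.5),

\[ \Lambda^TR^T = 0. \tag{3.12} \]

From (3.11) we get

\[ B = \Lambda^TA^TV^{-1}. \tag{3.13} \]

Substituting in the first equation of (3.8), we arrive at

\[ \Lambda^TA^TV^{-1}A + DR = L. \tag{3.14} \]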


This equation and (3.12), for the special case with k = 1, were derived and 
emphasized by Rao [5], [17]. 

We next derive an estimate of $\phi$ in terms of $\Lambda^T$ and $\theta^*$ for general $k$. We just
multiply (3.14) by $\theta^*$ and use $R\theta^* = 0$ to get

\[ \Lambda^TA^TV^{-1}A\theta^* = \phi^*. \tag{3.15} \]

The corresponding estimate in terms of $\Lambda^T$ and $x$ is

\[ \phi^* = Bx = \Lambda^TA^TV^{-1}x. \tag{3.16} \]

It follows that $\theta^*$ satisfies

\[ \Lambda^TA^TV^{-1}A\theta^* = \Lambda^TA^TV^{-1}x. \tag{3.17} \]

Equations (3.17) and (3.12) may be considered to be basic relations in $\theta^*$ and
$\Lambda^T$.

4. The general Gaussian theorem. We next demonstrate the general Gaussian
theorem that the value of $\theta^*$ obtained by least squares is consistent with that of
(3.17) and (3.12). We note first that $\theta^*$ in the general solution is an $s \times 1$ vector
and that the general solution is obtained by premultiplying $\theta^*$ by the fixed
$k \times s$ matrix $L$. The general theorem is established by proving the typical case
with $k = 1$ so that $L$, $B$, $D$, and $\Lambda$ are vectors and (3.17) becomes

\[ \lambda^TA^TV^{-1}A\theta^* = \lambda^TA^TV^{-1}x, \tag{4.1} \]

where $\lambda^T$ is $\lambda^T(1 \times s)$. Also (3.12) becomes

\[ \lambda^TR^T = 0. \tag{4.2} \]

Now we wish to minimize the scalar $Q = \epsilon^TV^{-1}\epsilon$, subject to the restriction
conditions. Then

\[ Q' = (A\theta - x)^TV^{-1}(A\theta - x) + 2(l - bA - dR)\lambda + 2\gamma R\theta. \tag{4.3} \]

Differentiation with respect to $\theta$ and $d$ leads to the "normal" equations

\[ A^TV^{-1}A\theta^* - A^TV^{-1}x + R^T\gamma^T = 0, \tag{4.4} \]

\[ \lambda^TR^T = 0. \tag{4.5} \]

Premultiplying (4.4) by $\lambda^T$ and substituting (4.5), we get

\[ \lambda^TA^TV^{-1}A\theta^* = \lambda^TA^TV^{-1}x. \tag{4.6} \]

Since (4.6) and (4.5) are identical with (4.1) and (4.2), the $\lambda$'s and $\theta$'s must be
the same, so the general Gaussian theorem is true.

This solution, which is similar to that of Rao, is satisfactory in proving the
generalized Gaussian theorem but it is not satisfactory in that it does not provide
an explicit value of $\theta^*$ (only implicit relations involving the vector parameter
$\lambda$) nor does it give an explicit expression for the unbiased linear estimator having
minimum variance. These are provided in the sections following.

One further remark should be made before leaving these results on least
squares. The Eqs. (4.6) and (4.5) may be considered to be the normal equations
of a general least squares problem expressed in terms of the vector parameter $\lambda$.
Comparison of (4.6) with (2.7) shows that these normal equations can be
obtained from the normal equations of the problem with no restrictions by
premultiplication by $\lambda^T$ where $\lambda^T$ is subject to the conditions $\lambda^TR^T = 0$.


5. The explicit form of the estimator. It appears that no one has provided the
explicit form for $\phi^*$ or for $\theta^*$. Postmultiplication of (3.14) by $(A^TV^{-1}A)^{-1}R^T$
followed by application of (3.12) eliminates $\Lambda^T$ with the resulting

\[ DR(A^TV^{-1}A)^{-1}R^T = L(A^TV^{-1}A)^{-1}R^T. \tag{5.1} \]

Now since $R(A^TV^{-1}A)^{-1}R^T$ is of order and rank $u$, we can write

\[ D = L(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}. \tag{5.2} \]

The value of $\Lambda^T$ is then from (3.14)

\[ \Lambda^T = L(A^TV^{-1}A)^{-1} - L(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1} \tag{5.3} \]

and from (3.13)

\[ B = L(A^TV^{-1}A)^{-1}A^TV^{-1} - L(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}A^TV^{-1} \tag{5.4} \]

so that

\[ \phi^* = L(A^TV^{-1}A)^{-1}A^TV^{-1}x - L(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}A^TV^{-1}x \tag{5.5} \]

is the linear unbiased estimator having minimum variance, and

\[ \theta^* = (A^TV^{-1}A)^{-1}A^TV^{-1}x - (A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}A^TV^{-1}x \tag{5.6} \]

is the explicit solution of the normal equations. Rao did not give an
explicit answer even for the case $k = 1$, since he did not derive an explicit formula
for $\lambda^T$. The argument above covers the Rao case with $L$ and $\lambda$ vectors. Thus
(5.5) and (5.6) hold with $L$ a vector. As is pointed out above, the $\theta^*$ which
results from least squares and from minimum variance is independent of $L$.

The results above are also general enough to include the Aitken results. These
can be obtained formally from the above results by using the convention that
$[R(A^TV^{-1}A)^{-1}R^T]^{-1}$ is 0 when $R = 0$, the formal equivalent of $u = 0$ side
conditions. Thus the last terms drop from (5.5) and (5.6) for the Aitken problem.
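
The explicit formulas (5.2), (5.4) and (5.6) can be checked numerically. The following is only a sketch on randomly generated matrices (an assumption for illustration; the names `Minv` and `proj` are not from the paper): it confirms that $\theta^*$ satisfies the restrictions $R\theta^* = 0$, that $B$ and $D$ satisfy the first condition of (3.8), and that $\phi^* = Bx = L\theta^*$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, s, u, k = 8, 4, 2, 3
A = rng.normal(size=(n, s))            # design matrix, rank s
R = rng.normal(size=(u, s))            # u independent linear restrictions
L = rng.normal(size=(k, s))            # k parametric functions
G = rng.normal(size=(n, n))
V = G @ G.T + n * np.eye(n)            # positive definite dispersion matrix
x = rng.normal(size=n)

Vinv = np.linalg.inv(V)
Minv = np.linalg.inv(A.T @ Vinv @ A)               # (A^T V^{-1} A)^{-1}
inner = np.linalg.inv(R @ Minv @ R.T)              # [R (A^T V^{-1} A)^{-1} R^T]^{-1}
proj = Minv - Minv @ R.T @ inner @ R @ Minv        # bracketed matrix in (5.3)

D = L @ Minv @ R.T @ inner                         # equation (5.2)
B = L @ proj @ A.T @ Vinv                          # equation (5.4)
theta_star = proj @ A.T @ Vinv @ x                 # equation (5.6)
phi_star = B @ x                                   # equation (5.5)
print(theta_star, phi_star)
```

Setting `R` to zero rows (the Aitken case) would simply drop the second term of `proj`, in line with the convention stated above.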

The above results also generalize those of David and Neyman [8] who placed
specifications on the dispersion matrix $V$. They defined $V$ to be a diagonal
matrix with

\[ v_{ii} = \frac{\sigma^2}{P_{ii}}, \quad\text{where}\quad P_{ii} = \frac{\sigma^2}{\sigma_i^2}. \tag{5.7} \]

The formula for $\phi^*$ then becomes

\[ \phi^* = L(A^TPA)^{-1}A^TPx - L(A^TPA)^{-1}R^T[R(A^TPA)^{-1}R^T]^{-1}R(A^TPA)^{-1}A^TPx. \tag{5.8} \]

Now $B$ is $\phi^*$ with $x = I$, and $\theta^*$ is $\phi^*$ with $L = I$.

If $P_{ii} = \sigma^2P'_{ii}$ with $P'_{ii} = 1/\sigma_i^2$, we have

\[ \phi^* = L(A^TP'A)^{-1}A^TP'x - L(A^TP'A)^{-1}R^T[R(A^TP'A)^{-1}R^T]^{-1}R(A^TP'A)^{-1}A^TP'x. \tag{5.9} \]

Then dropping the side conditions on the parameters we get

\[ B = L(A^TPA)^{-1}A^TP = L(A^TP'A)^{-1}A^TP'. \tag{5.10} \]

When $L$ is restricted to a vector, this is the David-Neyman result in matrix
form.


When $V = I$, $L = I$ and $R = 0$ we have the common case of unweighted least
squares regression

\[ \phi^* = \theta^* = (A^TA)^{-1}A^Tx \]

and

\[ B = (A^TA)^{-1}A^T. \tag{5.11} \]

The general results are immediately applicable to a variety of special cases
involving specifications on $V$, specifications on $L$, and specifications on $R$,
separately or in combinations.


6. The dispersion matrix of solutions. The dispersion matrix of solutions is
var $\phi^* = BVB^T$. Using the value of $B$ in (5.4), we get

\[ \operatorname{var}(\phi^*) = BVB^T = L(A^TV^{-1}A)^{-1}L^T - L(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}L^T. \tag{6.1} \]

When $k = 1$, this is an explicit result for the Rao problem. When there are no side
conditions we have the Aitken result

\[ \operatorname{var}(\phi^*) = L(A^TV^{-1}A)^{-1}L^T. \tag{6.2} \]

When the values of $x$ are uncorrelated with $v_{ii} = \sigma^2/P_{ii}$, (6.1) and (6.2)
become

\[ \operatorname{var}(\phi^*) = \{L(A^TPA)^{-1}L^T - L(A^TPA)^{-1}R^T[R(A^TPA)^{-1}R^T]^{-1}R(A^TPA)^{-1}L^T\}\sigma^2 \tag{6.3} \]

and

\[ \operatorname{var}(\phi^*) = L(A^TPA)^{-1}L^T\sigma^2. \tag{6.4} \]

When in addition the variables have a common variance $\sigma^2$, $\sigma_i = \sigma$ and
$P = I$. The Eqs. (6.3) and (6.4) appear with $(A^TA)^{-1}$ replacing $(A^TPA)^{-1}$.

If $\phi = \theta$, the above formulas appear with $L = I$. The simple case in which
there are no side conditions, $\phi = \theta$, with variables uncorrelated but with equal
variances gives

\[ \operatorname{var}(\theta^*) = (A^TA)^{-1}\sigma^2, \tag{6.5} \]

which is the formula for the dispersion matrix of regression coefficients in a
common model.


7. Use of a matrix relation. The results (6.1) and (5.4) enable us to write a
relation involving the value of $B$ which gives the value of $BVB^T$ having minimum
diagonal terms and the resulting matrix. In order to write this relation in
compact form we use

\[ C = (A^TV^{-1}A)^{-1} - (A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}, \tag{7.1} \]

which is $\Lambda^T$ with $L = I$, to get

\[ BVB^T = LCL^T + (B - LCA^TV^{-1})V(B - LCA^TV^{-1})^T. \tag{7.2} \]

The relation used by Plackett ([1], page 459) is a special case of this relation
with the terms involving $R$ deleted. Then $C = (A^TV^{-1}A)^{-1}$. Plackett's relation
may be considered to be a generalization of the relation used by Gauss in
establishing the theorem. Once the relation is established we see at once that the
diagonal terms of $BVB^T$ are minimized for general $B$ when

\[ B = LCA^TV^{-1} \tag{7.3} \]

as indicated in (5.4) and that the minimum values of the diagonal terms of the
dispersion matrix are the diagonal terms of

\[ BVB^T = LCL^T \tag{7.4} \]

as given in (6.1).

Once this general relation (7.2) is proposed, it may be verified by direct ex- 
pansion. Then the whole solution of the problem of the minimization of the 
diagonal terms of the dispersion matrix of the estimators is immediately avail- 
able as indicated by Plackett. If the relation is not known, and it has not been 
known previously for the general problem, it can be established with the use of 
matrix derivatives as shown above. 

The various special cases of the general matrix relation result from the ap- 
plication of specified conditions to V, L, and R. 
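
The direct expansion mentioned above can be mirrored numerically. The sketch below uses randomly generated matrices (an assumption for illustration only) and builds a competing unbiased $B$ by adding to the minimizing value (7.3) a matrix `N` with `N @ A = 0`; this is just one convenient way of producing an admissible $B$, not a construction from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, u, k = 8, 4, 2, 3
A = rng.normal(size=(n, s))
R = rng.normal(size=(u, s))
L = rng.normal(size=(k, s))
G = rng.normal(size=(n, n))
V = G @ G.T + n * np.eye(n)            # positive definite dispersion matrix

Vinv = np.linalg.inv(V)
Minv = np.linalg.inv(A.T @ Vinv @ A)
C = Minv - Minv @ R.T @ np.linalg.inv(R @ Minv @ R.T) @ R @ Minv   # (7.1)

B0 = L @ C @ A.T @ Vinv                # minimizing B of (7.3)

# A competing B with the same product BA: add rows orthogonal to the columns of A.
N = rng.normal(size=(k, n)) @ (np.eye(n) - A @ np.linalg.pinv(A))
B = B0 + N

lhs = B @ V @ B.T                                        # left side of (7.2)
rhs = L @ C @ L.T + (B - B0) @ V @ (B - B0).T            # right side of (7.2)
min_diag = np.diag(L @ C @ L.T)                          # diagonal of (7.4)
print(np.diag(lhs), min_diag)
```

Because the second term on the right of (7.2) is positive semi-definite, every diagonal term of `lhs` is at least the corresponding entry of `min_diag`.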


8. The variance of the residuals. Returning to the problem of least squares,
we call $E(\epsilon^TV^{-1}\epsilon)$ the variance of the (weighted) residuals. Then $\epsilon$ can be written

\[ \epsilon = (ACA^TV^{-1} - I)x, \tag{8.1} \]

where $C$ is given by (7.1), and $ACA^TV^{-1}$, and hence $ACA^TV^{-1} - I$, are
idempotent. Hence

\[ \epsilon^TV^{-1}\epsilon = x^T(ACA^TV^{-1} - I)^TV^{-1}(ACA^TV^{-1} - I)x = x^TV^{-1}x - x^TV^{-1}ACA^TV^{-1}x. \tag{8.2} \]

There is no loss in generality, for purpose of derivation, in assuming that $x$ in
(8.1) and (8.2) is a deviate with var $(x) = E(xx^T) = V$.
For the Aitken problem, $C = (A^TV^{-1}A)^{-1}$ and we have

\[ \epsilon^TV^{-1}\epsilon = x^TV^{-1}x - x^TV^{-1}A(A^TV^{-1}A)^{-1}A^TV^{-1}x. \tag{8.3} \]
To this we apply the triangular matrix square root transformation¹

\[ y = Wx \quad\text{with}\quad W^TW = V^{-1}. \tag{8.4} \]

We then have

\[ \epsilon^TV^{-1}\epsilon = y^Ty - y^TWA(A^TV^{-1}A)^{-1}A^TW^Ty \tag{8.5} \]

with

\[ \operatorname{var}(y) = E(Wxx^TW^T) = WVW^T = I, \tag{8.6} \]

so that, using the trace

\[ E(x^TV^{-1}x) = E(y^Ty) = n. \tag{8.7} \]

In order to find the expected value of the second term on the right in (8.5), we
use the additional transformation

\[ z = Sy \quad\text{with}\quad S^TS = WA(A^TV^{-1}A)^{-1}A^TW^T, \tag{8.8} \]

where $S$ is a triangular matrix. Since the rank of $WA(A^TV^{-1}A)^{-1}A^TW^T$ is $s$,
$S$ is of rank $s$, and there are $n - s$ rows identically zero. Then

\[ \epsilon^TV^{-1}\epsilon = y^Ty - z^Tz, \tag{8.9} \]

and since

\[ E(zz^T) = \begin{pmatrix} I_s & 0 \\ 0 & 0 \end{pmatrix}, \tag{8.10} \]

then

\[ E(z^Tz) = s \]

and

\[ E(\epsilon^TV^{-1}\epsilon) = E(y^Ty) - E(z^Tz) = n - s. \tag{8.11} \]


In the general problem with more complex $C$ we have the additional quadratic
form

\[ x^TV^{-1}A(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}A^TV^{-1}x \tag{8.12} \]

whose matrix is of rank $u$. Application of (8.4) followed by application of

\[ t = Uy, \quad\text{where}\quad U^TU = WA(A^TV^{-1}A)^{-1}R^T[R(A^TV^{-1}A)^{-1}R^T]^{-1}R(A^TV^{-1}A)^{-1}A^TW^T, \tag{8.13} \]

reduces this term to

\[ t^Tt \quad\text{with}\quad E(t^Tt) = u. \tag{8.14} \]

Then

\[ E(\epsilon^TV^{-1}\epsilon) = n - s + u = n - (s - u). \tag{8.15} \]

¹ A triangular matrix square root, as applied to this problem, is a triangular matrix $W$
defined by $W^TW = V^{-1}$. This should not be confused with the (non-triangular) algebraic
matrix square root defined by $(V^{-1/2})^2 = V^{-1}$.

This result is what one would expect. If the values of $x$ were distributed
normally, the positive definite quadratic form $\epsilon^TV^{-1}\epsilon$ would be distributed as $\chi^2$
with $E(\chi^2) = n - (s - u)$, where $s - u$ indicates the number of independent parameters.
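
The value in (8.15) can also be confirmed without the square root transformations, using the standard identity $E(x^TQx) = \operatorname{trace}(QV)$ for a deviate $x$ with dispersion matrix $V$. This trace route is a shortcut substituted here for checking purposes, not the derivation used in the paper, and the matrices are randomly generated stand-ins.

```python
import numpy as np

rng = np.random.default_rng(3)
n, s, u = 9, 4, 2
A = rng.normal(size=(n, s))
R = rng.normal(size=(u, s))
G = rng.normal(size=(n, n))
V = G @ G.T + n * np.eye(n)            # positive definite dispersion matrix

Vinv = np.linalg.inv(V)
Minv = np.linalg.inv(A.T @ Vinv @ A)
C = Minv - Minv @ R.T @ np.linalg.inv(R @ Minv @ R.T) @ R @ Minv   # (7.1)

H = A @ C @ A.T @ Vinv                 # idempotent matrix appearing in (8.1)
T = H - np.eye(n)                      # epsilon = T x

# For a deviate x with var(x) = V, E(x^T Q x) = trace(Q V).
expected = np.trace(T.T @ Vinv @ T @ V)
print(expected, n - s + u)
```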

This result is independent of $k$. In the Rao problem, $k = 1$, and the value of
$E(\epsilon^TV^{-1}\epsilon)$ is $n - s + u$ as above. For the Aitken problem, $u = 0$, and the value
is $n - s$. Where $V^{-1} = P/\sigma^2$ we have

\[ E(\epsilon^TP\epsilon) = (n - s + u)\sigma^2 \tag{8.16} \]

and when $u = 0$, this is

\[ E(\epsilon^TP\epsilon) = (n - s)\sigma^2 \tag{8.17} \]

as shown by David and Neyman for the case of uncorrelated variables ([8],
pages 110-112). When $P = I$ this becomes

\[ E(\epsilon^T\epsilon) = (n - s)\sigma^2 \tag{8.18} \]

as shown by Aitken using the properties of idempotent matrices ([3], page 139).
9. An estimator of the dispersion matrix of $\phi^*$. David and Neyman [8] have
provided an unbiased estimate of var $\phi^*$ for the case in which $V^{-1} = P/\sigma^2$, the
$x$'s are uncorrelated and $L$ is a vector. A generalization related to the
David-Neyman formula for the general problem is, for known $V$,

\[ \operatorname{Est\,var}(\phi^*) = \frac{\epsilon^TV^{-1}\epsilon}{n - s + u}\,BVB^T, \tag{9.1} \]

since its expected value is the dispersion matrix of $\phi^*$.
When $V$ is known this formula is of little value since $BVB^T$ can be computed
and no estimation is necessary. However if $V$ is not known, but $P$ is, we have

\[ \operatorname{Est\,var}(\phi^*) = \frac{\epsilon^TP\epsilon}{n - s + u}\,L\{(A^TPA)^{-1} - (A^TPA)^{-1}R^T[R(A^TPA)^{-1}R^T]^{-1}R(A^TPA)^{-1}\}L^T. \tag{9.2} \]

When $P = I$, the case of equal variances, we have the important

\[ \operatorname{Est\,var}(\phi^*) = \frac{\epsilon^T\epsilon}{n - s + u}\,L\{(A^TA)^{-1} - (A^TA)^{-1}R^T[R(A^TA)^{-1}R^T]^{-1}R(A^TA)^{-1}\}L^T. \tag{9.3} \]
In the case of no side conditions we have

\[ \operatorname{Est\,var}(\phi^*) = \frac{\epsilon^TP\epsilon}{n - s}\,L(A^TPA)^{-1}L^T. \tag{9.4} \]

Using the value of $B$ in (5.10) we get

\[ \operatorname{Est\,var}(\phi^*) = \frac{\epsilon^TP\epsilon}{n - s}\,BP^{-1}B^T. \tag{9.5} \]

If $L$ is a vector, the estimate is a scalar. In the David-Neyman scalar
notation, with the $x$'s uncorrelated and $B$ a row vector ($\lambda$) we have

\[ \mu^2 = \frac{S_0}{n - s}\sum_{i=1}^{n}\frac{\lambda_i^2}{P_i}, \tag{9.6} \]

where $[\lambda_i] = \lambda = L(A^TPA)^{-1}A^TP = B$. Hence (9.2) gives the estimator matrix
of var $(\phi^*)$ for a more general problem than does (9.6).


Appendix Showing Orders of Matrices and Conditions 


Matrix Order Matrix Order 

x nX<1 vy rx 

A nx se A sxXk 

6 and 6* sx 1 Dy x1 

€ nX1 R “xXe 

V nxXn g “x i 

Q and Q’ 1x! D kxXu 

A'V"'A sXe us xi 

L k Xe BY 1xXu 

¢ and ¢* kx 1 R(ATV A)" RT? “xX 

B kXn iy sie 

var o* & ME r nXn 
BVB’ kxXk 


$u < s < n$; $u = 0$ gives the Aitken problem, $k = 1$ gives the Rao problem, $V^{-1} = P/\sigma^2$
gives the David-Neyman condition.
A CENTRAL LIMIT THEOREM FOR SUMS OF INTERCHANGEABLE
RANDOM VARIABLES


By H. Chernoff and H. Teicher


Stanford University and Purdue University


1. Summary. A collection of random variables is defined to be interchangeable
if every finite subcollection has a joint distribution which is a symmetric
function of its arguments.

Double sequences of random variables $X_{nk}$, $k = 1, 2, \ldots, k_n$ ($k_n \to \infty$),
$n = 1, 2, \ldots$, interchangeable (as opposed to independent) within rows, are
considered. For each $n$, $X_{n1}, \ldots, X_{nk_n}$ may (a) have a non-random sum, or
(b) be embeddable in an infinite sequence of interchangeable random variables,
or (c) neither. In case (a), a theorem is obtained providing conditions under
which the partial sums have a limiting normal distribution. Applications to such
well-known examples as ranks and percentiles are exhibited. Case (b) is treated
elsewhere while case (c) remains open.


2. Terminology, notation and preliminaries. If $X_n$ is a sequence of r.v.'s
converging in probability (in measure) to a r.v. $X$, that is,

\[ \lim_{n\to\infty} P\{|X_n - X| > \epsilon\} = 0, \quad \text{all } \epsilon > 0, \]

we abbreviate this by writing $X_n \xrightarrow{P} X$. This, in turn, implies $g(X_n) \xrightarrow{P} g(X)$ for
any continuous function $g(x)$. If the corresponding c.d.f.'s $F_{X_n}(x) \to F_X(x)$ at
all continuity points of the latter (in the sense of convergence of real numbers),
we say $X_n$ converges in law (or distribution) to $X$ and write $X_n \xrightarrow{L} X$. We shall
use frequently without ado the facts that if $X_n \xrightarrow{P} X$ and $c_n$ is a sequence of
positive constants such that $c_n \to c$, then $c_nX_n \xrightarrow{P} cX$ [3].


The notation P{A | B} will be used to designate the probability of an event 
A, given the occurrence of the event B, i.e., the conditional probability of A 
given B. 

We shall be interested in and deal exclusively with r.v.'s whose joint c.d.f.
is a symmetric function of its arguments. The same will then be true of the joint
Fourier transform or characteristic function. This characteristic may also be
expressed by stating that the joint distribution of $X_1, \ldots, X_k$ is invariant
under permutations of the subscripts of the $X$'s. Such random variables seem to
have been introduced by de Finetti (cf. [4]). They have been termed
"symmetrically dependent" by E. Sparre Anderson who also has studied some of
their properties in a series of papers [1], [2]. By a quirk of terminology not
infrequent in mathematics, independent identically distributed r.v.'s are then
subsumed under the category "symmetric dependence." On the grounds of
brevity and connotation of the characteristic involved, we propose the sobriquet
"interchangeable random variables" to denote any finite set of r.v.'s whose joint
c.d.f. is symmetric.

In the case of an infinite sequence of r.v.'s, every finite subset of which has
this property, Loève speaks of "exchangeable r.v.'s." However, the terminology
of "interchangeability" of r.v.'s will be extended to include this case as well.

It is immediately evident that the r.v.'s, say $X_{i_1}, X_{i_2}, \ldots, X_{i_r}$, of any
finite subcollection of a collection of interchangeable r.v.'s (i.r.v.'s) are
themselves interchangeable and have a joint c.d.f. depending on $r$ but not the
permutation $(i_1, \ldots, i_r)$. In particular, the marginal c.d.f.'s $F_j(x) = F_{X_j}(x) =
P\{X_j < x\}$ are identical for $j = 1, 2, \ldots, k$.

It is worth noting at the outset that it is, in general, not possible to embed
a given finite set of i.r.v.'s in an infinite set of i.r.v.'s (or even in a larger finite
set). For example, if $P\{X_1 = 1, X_2 = 0\} = \tfrac{1}{2} = P\{X_1 = 0, X_2 = 1\}$ one cannot
even adjoin a third r.v. so as to preserve interchangeability.

We commence with some elementary observations on the nature of i.r.v.’s. 
Two of these will be cast in the form of lemmas. 

Suppose (as we shall throughout) that the i.r.v.'s under consideration have
finite second order moments $EX_iX_j = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, dF_{X_i,X_j}(x, y)$, $i, j = 1, 2, \dots, k$.
It is of course sufficient for this that when $m = 2$, $EX_1^m = \int_{-\infty}^{\infty} x^m\, dF_1(x) < \infty$.
Take $\rho_{ii} = 1$, $i = 1, \dots, k$ and define the (common) correlation coefficient
between $X_i$ and $X_j$ by

$$\rho_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\sigma_{X_i}\sigma_{X_j}}
= \frac{EX_iX_j - (EX_i)(EX_j)}{\sqrt{E(X_i - EX_i)^2\, E(X_j - EX_j)^2}}
= \frac{EX_1X_2 - (EX_1)^2}{E(X_1 - EX_1)^2} = \rho, \qquad i \ne j.$$

Then, the positive semi-definiteness of the correlation matrix

$$R = \{\rho_{ij}\} = \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}$$

constrains $\rho$ to be at least $-1/(k - 1)$, where k is the number of i.r.v.'s. For
if J is the $k \times k$ matrix consisting entirely of ones and I is the identity matrix
of order k,

$$|R| = |\rho J + (1 - \rho) I| = [k\rho + (1 - \rho)](1 - \rho)^{k-1} \ge 0.$$

Thus, $\rho \ge -1/(k - 1)$. Consequently, the correlation between any pair of an
infinite collection of interchangeable r.v.'s cannot be negative.
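The determinant identity and the resulting lower bound on the common correlation can be checked exactly for small k. The following is only an illustrative sketch: the helper names `det`, `equicorrelation` and `closed_form` are hypothetical, and the hand-written elimination routine stands in for any linear-algebra library.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by exact (fraction) Gaussian elimination; helper for this sketch."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result          # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def equicorrelation(k, rho):
    """k x k matrix with ones on the diagonal and rho elsewhere."""
    return [[Fraction(1) if i == j else rho for j in range(k)] for i in range(k)]

def closed_form(k, rho):
    """[k*rho + (1 - rho)] * (1 - rho)**(k - 1), the determinant stated in the text."""
    return (k * rho + (1 - rho)) * (1 - rho) ** (k - 1)

k = 4
for rho in (Fraction(1, 2), Fraction(0), Fraction(-1, 3), Fraction(-1, 2)):
    assert det(equicorrelation(k, rho)) == closed_form(k, rho)

# The determinant vanishes at rho = -1/(k - 1) and is negative just below it,
# so no smaller common correlation is possible for k interchangeable r.v.'s.
assert closed_form(k, Fraction(-1, k - 1)) == 0
assert closed_form(k, Fraction(-1, 2)) < 0
```

With k = 3 and rho = -1 (the two-point example given earlier) the determinant is negative, which is one way to see why a third r.v. cannot be adjoined there.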
The following simple lemmas, which we present without proof, are useful. Let
X and Y designate the vectors $(X_1, X_2, \dots, X_k)$ and $(Y_1, Y_2, \dots, Y_k)$.

LEMMA 1. If $X_1, X_2, \dots, X_k$ are interchangeable r.v.'s and $Y = y(X)$ is
defined by $Y_i = \Phi[X_i, g(X)]$, $i = 1, 2, \dots, k$, where $\Phi$ and g are Borel measurable
functions, the latter being symmetric in its k arguments, then $Y_1, Y_2, \dots, Y_k$
are interchangeable.

LEMMA 2. If $Y = (Y_1, Y_2, \dots, Y_k)$ is a random permutation of the
interchangeable r.v.'s $X_1, X_2, \dots, X_k$, then Y has the same distribution as X.


3. Background and Framework. The term ‘‘Central Limit Theorem”’ is a loose 
designation for one of an agglomeration of theorems dealing with limiting nor- 
mality of distributions of sums of random variables—in the classical treatment— 
independent random variables. 

The early results of De Moivre and Laplace have been succeeded by ever 
more powerful theorems set in an increasingly general framework. Recent works 
[5], [6] commence with a double sequence of rowwise independent r.v.’s (i.e., the 
r.v.’s within each row are independent) 


$$\begin{array}{l}
X_{11}, X_{12}, \dots, X_{1k_1} \\
X_{21}, X_{22}, \dots, X_{2k_2} \\
\quad\vdots \\
X_{n1}, X_{n2}, \dots, X_{nk_n} \\
\quad\vdots
\end{array}$$

(where $k_n \to \infty$) and investigate the limiting distributions, i.e., c.d.f.'s of the
row sums, say $S_n = \sum_{i=1}^{k_n} X_{ni}$. To render the problem more meaningful the
r.v.'s are required to be "infinitesimal" (or asymptotically constant), i.e.,

$$\lim_{n\to\infty}\ \max_{1 \le i \le k_n} P\{|X_{ni}| > \epsilon\} = 0, \qquad \text{all } \epsilon > 0.$$

A famous theorem of Khintchine asserts that the class of limiting distributions
of such sums $S_n$ coincides with the class of infinitely divisible laws [5]. A
necessary and sufficient condition that the limiting distribution (assuming one exists)
of sums of row-wise independent infinitesimal r.v.'s be normal is well known,
namely, $\max_{1 \le i \le k_n} |X_{ni}| \xrightarrow{P} 0$. (This actually implies infinitesimality here.)
For purposes of comparison with Theorem 1 of the next section we state the
following result of Raikov (cf. [5]):

If $Z_{ni}$, $i = 1, \dots, k_n$ are infinitesimal rowwise independent r.v.'s with zero
means and finite variances $\sigma_{ni}^2$, with $\sum_{i=1}^{k_n} \sigma_{ni}^2 = 1$, a necessary and sufficient
condition that the c.d.f. of $\sum_{i=1}^{k_n} Z_{ni}$ converges to the normal c.d.f. with mean
0 and variance 1 is that $\sum_{i=1}^{k_n} Z_{ni}^2 \xrightarrow{P} 1$.

Attempts have been made to relax the requirement of independence with 
varying degrees of success. Perhaps a natural and useful generalization is to 
double sequences of interchangeable random variables. 

In this direction, let $X_{ni}$, $i = 1, \dots, k_n$ comprise a (finite) set of i.r.v.'s for
every $n = 1, 2, \dots$.
If we stipulate that $\lim_{n\to\infty} P\{|X_{n1}| > \epsilon\} = 0$, all $\epsilon > 0$, the question of the
nature of the class C* of all limiting distributions of row sums may again be posed.
Clearly, C* includes all stable distributions but contains others as well. This
follows from a result of von Mises [7] who showed that the distribution of the
number $S_{n,r_n}$ of unoccupied cells in a random casting of $r_n$ objects into n cells
approaches that of the Poisson when $n, r_n \to \infty$ in a manner such that the
expected number of vacancies is constant. If the expected proportion of vacancies
converges to a constant, then Irving Weiss [9] has shown that the limiting
distribution, suitably normalized, is normal. But $S_{n,r_n} = \sum_{i=1}^{n} X_{ni}$ where the $X_{ni}$
are i.r.v.'s assuming the values one or zero (according as the ith cell is empty or
not). Therefore, the Poisson distribution and in fact all infinitely divisible
distributions belong to C*.

In this paper, we consider only the case of limiting normal distributions and 
treat the first of the following two situations: 

(a) For each $n = 1, 2, \dots$, the i.r.v.'s $X_{ni}$, $i = 1, 2, \dots, k_n$ have a
non-random sum.

(b) For each $n = 1, 2, \dots$, the i.r.v.'s $X_{ni}$, $i = 1, \dots, k_n$ are embeddable
in an infinite sequence of i.r.v.'s.

These cases are mutually exclusive since if $\sum_{i=1}^{k_n} X_{ni} = C_n$, the covariance of
any pair of i.r.v.'s equals $-1/(k_n - 1)$ multiplied by the common variance.
But then their correlation is negative, which is precluded (under case b) by a
prior remark.
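The covariance claim rests on a one-line variance computation, written out here as a sketch. The symbols $\sigma_n^2$ (common variance in row n) and $c_n$ (common covariance in row n) are introduced only for this illustration.

```latex
0 = \operatorname{Var}\Bigl(\sum_{i=1}^{k_n} X_{ni}\Bigr)
  = k_n\,\sigma_n^2 + k_n (k_n - 1)\, c_n
\quad\Longrightarrow\quad
c_n = -\frac{\sigma_n^2}{k_n - 1},
\qquad
\rho = \frac{c_n}{\sigma_n^2} = -\frac{1}{k_n - 1} < 0 .
```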


4. I.R.V.'s whose sum is non-random. For each $n = 1, 2, \dots$, let $X'_{ni}$,
$i = 1, 2, \dots, k_n$ ($\to \infty$) be i.r.v.'s with finite variance $\sigma_n^2
= E(X'_{n1} - EX'_{n1})^2$ and satisfying the linear constraint

$$\sum_{i=1}^{k_n} X'_{ni} = C_n .$$

Naturally, under such a proviso we must investigate partial rather than
complete row sums.
If we define

$$X_{ni} = \frac{1}{\sigma_n}\Bigl(X'_{ni} - \frac{C_n}{k_n}\Bigr),$$

the $X_{ni}$ are, by Lemma 1, i.r.v.'s satisfying the relationships

$$\text{(i)}\qquad \sum_{i=1}^{k_n} X_{ni} = 0,$$

and

$$\text{(ii)}\qquad EX_{ni}^2 = \sigma_{X_{ni}}^2 = 1, \quad i = 1, 2, \dots, k_n, \text{ and all } n = 1, 2, \dots.$$

² The theorems obtained for the case of infinite sequences of i.r.v.'s overlap results of
Professors Blum and Rosenblatt of Indiana University and will appear in a joint
publication.

We suppose, therefore, without loss of generality, that for each
$n = 1, 2, \dots$, $\{X_{ni}\}$, $i = 1, \dots, k_n$ ($\to \infty$)
are rowwise i.r.v.'s satisfying (i) and (ii) and possessing the joint c.d.f.
$F_n(x_1, x_2, \dots, x_{k_n})$. We have then


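THEOREM 1. For each $n = 1, 2, \dots$, let $\{X_{ni}\}$ be interchangeable random
variables satisfying (i) and (ii). If

$$\max_{1 \le i \le k_n} \frac{|X_{ni}|}{\sqrt{k_n}} \xrightarrow{P} 0, \tag{1}$$

$$\frac{1}{k_n}\sum_{i=1}^{k_n} X_{ni}^2 \xrightarrow{P} 1, \tag{2}$$

and $m_n < k_n$ is a sequence of positive integers such that $\lim_{n\to\infty} m_n/k_n = \alpha$,
$0 < \alpha < 1$, then

$$\lim_{n\to\infty} P\Bigl\{\frac{1}{\sqrt{m_n}}\sum_{i=1}^{m_n} X_{ni} < x\Bigr\}
= \frac{1}{\sqrt{2\pi(1 - \alpha)}}\int_{-\infty}^{x} \exp\Bigl[-\frac{y^2}{2(1 - \alpha)}\Bigr]\, dy .$$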

PROOF. For any set of real numbers $x_{ni}$, if $\max_{1 \le i \le k_n} |x_{ni}|/\sqrt{k_n} = o(1)$ and
$\lim_{n\to\infty} (1/k_n)\sum_{i=1}^{k_n} x_{ni}^2 = 1$, then

$$\max_{1 \le i \le k_n} \frac{|x_{ni}|}{\sqrt{\sum_{j=1}^{k_n} x_{nj}^2}} = o(1).$$

It follows directly that if the $x_{ni}$ are r.v.'s and the analogous conditions are true
"in probability" the conclusion holds "in probability." That is, (1) and (2)
imply

$$\max_{1 \le i \le k_n} \frac{|X_{ni}|}{\sqrt{\sum_{j=1}^{k_n} X_{nj}^2}} \xrightarrow{P} 0. \tag{5}$$

Next, let $Y_{n1}, \dots, Y_{nk_n}$ be a randomly selected permutation of $X_{n1}, \dots, X_{nk_n}$.
Then even when it is stipulated that $X_{ni}$ = fixed real number $x_{ni}$,
$i = 1, 2, \dots, k_n$, the quantity

$$U_n = \Bigl(\sum_{i=1}^{k_n} x_{ni}^2\Bigr)^{-1/2} \sum_{i=1}^{m_n} Y_{ni}$$

is a random variable.

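Suppose that for some c.d.f. G(u) and arbitrary $\epsilon > 0$, there exist $\delta_\epsilon$
and an integer $N_1(\epsilon)$ (all independent of $x_{n1}, \dots, x_{nk_n}$) such that

$$\max_{1 \le i \le k_n} \frac{|x_{ni}|}{\sqrt{\sum_{j=1}^{k_n} x_{nj}^2}} < \delta_\epsilon$$

implies

$$\bigl|P\{U_n < u \mid X_{ni} = x_{ni},\ i = 1, \dots, k_n\} - G(u)\bigr| < \epsilon \tag{6}$$

for all $n > N_1(\epsilon)$ and continuity points u of G(u). By (5), there exists $N_2(\epsilon)$
such that for all $n > N_2(\epsilon)$, say

$$\epsilon > P\Bigl\{\max_{1 \le i \le k_n} \frac{|X_{ni}|}{\sqrt{\sum_{j=1}^{k_n} X_{nj}^2}} \ge \delta_\epsilon\Bigr\} = P\{A_n\}. \tag{7}$$

Then, from (6) and (7) for arbitrary $\epsilon > 0$ and $n > \max[N_1(\epsilon), N_2(\epsilon)]$ and
continuity points u of G(u),

$$\begin{aligned}
|P\{U_n < u\} - G(u)|
&= \Bigl|\int_{R^{k_n}} \bigl[P\{U_n < u \mid X_{ni} = x_{ni},\ i = 1, \dots, k_n\} - G(u)\bigr]\, dF_n(x_1, \dots, x_{k_n})\Bigr| \\
&\le \int_{\bar A_n} \bigl|P\{U_n < u \mid X_{ni} = x_{ni},\ i = 1, \dots, k_n\} - G(u)\bigr|\, dF_n + \int_{A_n} dF_n < 2\epsilon .
\end{aligned} \tag{8}$$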

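For simplicity in writing, let Q be an r.v. with c.d.f. G(u); then for $A > 0$,
$Q_A = (1/A)\,Q$ is an r.v. with distribution G(Au). Under the proviso (6), (8) shows
that

$$U_n \xrightarrow{L} Q .$$

On the other hand, according to (2),

$$\sqrt{\frac{1}{k_n}\sum_{i=1}^{k_n} X_{ni}^2} \xrightarrow{P} 1 .$$

Consequently, (see, e.g., [3]),

$$\frac{1}{\sqrt{m_n}}\sum_{i=1}^{m_n} Y_{ni}
= \sqrt{\frac{k_n}{m_n}}\ \sqrt{\frac{1}{k_n}\sum_{i=1}^{k_n} X_{ni}^2}\ \, U_n \xrightarrow{L} Q_{\sqrt{\alpha}} .$$

But by Lemma 2, $\sum_{i=1}^{m_n} Y_{ni}$ and $\sum_{i=1}^{m_n} X_{ni}$ have the same distribution, and thus,
under the proviso (6),

$$\frac{1}{\sqrt{m_n}}\sum_{i=1}^{m_n} X_{ni} \xrightarrow{L} Q_{\sqrt{\alpha}} . \tag{9}$$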

It remains to verify (6) for G(u) the c.d.f. of $Q = N_{0,\alpha(1-\alpha)}$, where $N_{\mu,\sigma^2}$
represents a normal random variable with mean $\mu$ and variance $\sigma^2$. To do so, it suffices
to prove that

$$U_n = \Bigl(\sum_{i=1}^{k_n} x_{ni}^2\Bigr)^{-1/2} \sum_{i=1}^{m_n} Y_{ni} \xrightarrow{L} Q = N_{0,\alpha(1-\alpha)}, \tag{10}$$

providing $Y_{n1}, Y_{n2}, \dots, Y_{nk_n}$ is a random permutation of the fixed real numbers
$x_{n1}, x_{n2}, \dots, x_{nk_n}$, where

$$\max_{1 \le i \le k_n} |x_{ni}| \Bigl(\sum_{j=1}^{k_n} x_{nj}^2\Bigr)^{-1/2} = o(1) \tag{11}$$

and $\sum_{i=1}^{k_n} x_{ni} = 0$. A theorem of Noether [8] states that the distribution of

$$L_n = \sum_{i=1}^{k_n} d_{ni} Y_{ni}$$
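converges to the normal distribution (when normalized by its mean and standard
deviation) if the $d_{ni}$ are fixed real numbers such that

$$D_{n,s} = \frac{\dfrac{1}{k_n}\sum_{i=1}^{k_n} (d_{ni} - \bar d_n)^s}{\Bigl[\dfrac{1}{k_n}\sum_{i=1}^{k_n} (d_{ni} - \bar d_n)^2\Bigr]^{s/2}} = O(1), \qquad s = 3, 4, \dots,$$

and

$$\frac{\sum_{i=1}^{k_n} (x_{ni} - \bar x_n)^s}{\Bigl[\sum_{i=1}^{k_n} (x_{ni} - \bar x_n)^2\Bigr]^{s/2}} = o(1), \qquad s = 3, 4, \dots,$$

with $\bar d_n = (1/k_n)\sum_{i=1}^{k_n} d_{ni}$ and $\bar x_n = (1/k_n)\sum_{i=1}^{k_n} x_{ni} = 0$. Let $d_{ni} = 1$ for $1 \le i \le m_n$
and 0 for $m_n + 1 \le i \le k_n$. Then $D_{n,s} = O(1)$ for $s = 3, 4, \dots$. Furthermore,
from (11),

$$\Bigl|\sum_{i=1}^{k_n} x_{ni}^s\Bigr| \Bigl(\sum_{i=1}^{k_n} x_{ni}^2\Bigr)^{-s/2}
\le \Bigl[\max_{1 \le i \le k_n} |x_{ni}|\Bigr]^{s-2} \Bigl(\sum_{i=1}^{k_n} x_{ni}^2\Bigr)^{-(s-2)/2} = o(1) \qquad \text{for } s = 3, 4, \dots.$$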

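Thus Noether's theorem applies to $L_n = \sum_{i=1}^{m_n} Y_{ni}$, whose mean and variance
we shall show to be given by $\mu_n = 0$ and

$$\sigma_n^2 = \frac{m_n (k_n - m_n)}{k_n (k_n - 1)} \sum_{i=1}^{k_n} x_{ni}^2 . \tag{12}$$

Then we shall have $L_n/\sigma_n \xrightarrow{L} N_{0,1}$ and the desired result

$$U_n = \frac{L_n}{\sigma_n}\sqrt{\frac{m_n (k_n - m_n)}{k_n (k_n - 1)}} \xrightarrow{L} N_{0,\alpha(1-\alpha)} .$$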

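We now conclude by evaluating $\mu_n$ and $\sigma_n$.

$$E(Y_{ni}) = \sum_{a=1}^{k_n} x_{na}/k_n = 0, \qquad E(Y_{ni}^2) = \sum_{a=1}^{k_n} x_{na}^2/k_n ,$$

$$E(Y_{ni} Y_{nj}) = \Bigl(\sum_{a \ne b} x_{na} x_{nb}\Bigr)\Big/ k_n(k_n - 1) = -\sum_{a=1}^{k_n} x_{na}^2 \Big/ k_n(k_n - 1), \qquad i \ne j.$$

Hence

$$\sigma_n^2 = E\Bigl(\sum_{i=1}^{m_n} Y_{ni}\Bigr)^2
= \Bigl(\sum_{i=1}^{k_n} x_{ni}^2\Bigr)\Bigl(\frac{m_n}{k_n} - \frac{m_n(m_n - 1)}{k_n(k_n - 1)}\Bigr),$$

which matches (12), concluding the proof.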
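Formula (12) can be confirmed by brute force for a small array of fixed numbers. The sketch below enumerates every permutation exactly; the function names and the test vector are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import permutations

def partial_sum_moments(x, m):
    """Exact mean and variance of the sum of the first m entries of a
    uniformly random permutation of the fixed numbers x."""
    perms = list(permutations(x))
    sums = [sum(p[:m]) for p in perms]
    mean = Fraction(sum(sums), len(perms))
    second = Fraction(sum(s * s for s in sums), len(perms))
    return mean, second - mean * mean

def variance_formula(x, m):
    """Right-hand side of (12): m(k - m) / (k(k - 1)) times the sum of squares."""
    k = len(x)
    return Fraction(m * (k - m), k * (k - 1)) * sum(v * v for v in x)

x = [3, -1, -2, 4, -4, 0]   # arbitrary integers chosen to sum to zero
for m in range(1, len(x)):
    mean, var = partial_sum_moments(x, m)
    assert mean == 0
    assert var == variance_formula(x, m)
```

Enumeration is feasible only for very small k (here 6! = 720 permutations); it is meant as a check of the algebra, not as a computational method.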
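COROLLARY 1. For each positive integer $n = 1, 2, \dots$, let $\{X_{ni}\}$, $i = 1, 2, \dots,
k_n$ ($\to \infty$) be i.r.v.'s satisfying (i) and (ii). If $m_n < k_n$ is a sequence of positive
integers with $\lim_{n\to\infty} m_n/k_n = \alpha$, $0 < \alpha < 1$ and

$$E[X_{n1}^4] = o(k_n), \qquad \operatorname{Cov}(X_{n1}^2, X_{n2}^2) = o(1), \tag{3}$$

then the conclusion of the theorem holds.
PROOF. For any $\eta > 0$,

$$P\Bigl\{\max_{1 \le i \le k_n} \frac{|X_{ni}|}{\sqrt{k_n}} > \eta\Bigr\}
\le \sum_{i=1}^{k_n} P\Bigl\{\frac{|X_{ni}|}{\sqrt{k_n}} > \eta\Bigr\}
\le \frac{k_n\, E X_{n1}^4}{k_n^2\, \eta^4} = o(1)$$

and

$$P\Bigl\{\Bigl|\frac{1}{k_n}\sum_{i=1}^{k_n} X_{ni}^2 - 1\Bigr| > \eta\Bigr\}
\le \frac{E\Bigl[\sum_{i=1}^{k_n} (X_{ni}^2 - 1)\Bigr]^2}{k_n^2\, \eta^2}
= \frac{k_n E(X_{n1}^2 - 1)^2 + k_n(k_n - 1)\operatorname{Cov}(X_{n1}^2, X_{n2}^2)}{k_n^2\, \eta^2} = o(1).$$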


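COROLLARY 2. For each $n = 1, 2, \dots$, let $\{X'_{ni}\}$, $i = 1, 2, \dots, k_n$ ($\to \infty$) be
i.r.v.'s with $\sum_{i=1}^{k_n} X'_{ni} = C_n$ and $\sum_{i=1}^{k_n} (X'_{ni})^2 = D_n^2 > 0$. If

$$\max_{1 \le i \le k_n} \frac{|X'_{ni} - C_n/k_n|}{\sqrt{k_n}\,\sqrt{(1/k_n)(D_n^2 - C_n^2/k_n)}} \xrightarrow{P} 0,$$

then the conclusion of the theorem holds for $(1/\sqrt{m_n}) \sum_{i=1}^{m_n} X_{ni}$, where

$$X_{ni} = \frac{X'_{ni} - C_n/k_n}{[(1/k_n)(D_n^2 - C_n^2/k_n)]^{1/2}} .$$

PROOF. Condition (2) is certainly satisfied since $(1/k_n) \sum_{i=1}^{k_n} X_{ni}^2 = 1$.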
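COROLLARY 3. For each $n = 1, 2, \dots$, let $\{X_{ni}\}$, $i = 1, \dots, k_n$ ($\to \infty$) be i.r.v.'s
with $EX_{ni} = 0$, $EX_{ni}^2 = 1$ and $\bar X_n = (1/k_n) \sum_{i=1}^{k_n} X_{ni}$. If the $\{X_{ni}\}$ satisfy (1),
(2), and

$$E(X_{n1} X_{n2}) = o(1), \tag{4}$$

then

$$\lim_{n\to\infty} P\Bigl\{\frac{1}{\sqrt{m_n}} \sum_{i=1}^{m_n} (X_{ni} - \bar X_n) < x\Bigr\}
= \frac{1}{\sqrt{2\pi(1 - \alpha)}} \int_{-\infty}^{x} \exp\Bigl[-\frac{y^2}{2(1 - \alpha)}\Bigr]\, dy .$$

PROOF. Let

$$Y_{ni} = \frac{X_{ni} - \bar X_n}{\sqrt{E(X_{n1} - \bar X_n)^2}}
= \sqrt{\frac{k_n}{(k_n - 1)(1 - EX_{n1}X_{n2})}}\ (X_{ni} - \bar X_n).$$

Then, applying Lemma 1 with $g(X) = \bar X_n$, it follows that the $\{Y_{ni}\}$ are i.r.v.'s.
Further, $\sum_{i=1}^{k_n} Y_{ni} = 0$ and $EY_{ni}^2 = 1$. Since

$$0 \le \max_{1 \le i \le k_n} \frac{|Y_{ni}|}{\sqrt{k_n}} \le \frac{2}{\sqrt{(k_n - 1)(1 - EX_{n1}X_{n2})}}\ \max_{1 \le i \le k_n} |X_{ni}| ,$$

(1) and (4) imply

$$\max_{1 \le i \le k_n} \frac{|Y_{ni}|}{\sqrt{k_n}} \xrightarrow{P} 0 .$$

Next, for every $\epsilon > 0$,

$$P\{|\bar X_n| > \epsilon\} \le \frac{E \bar X_n^2}{\epsilon^2} = \frac{1 + (k_n - 1) E X_{n1} X_{n2}}{k_n\, \epsilon^2} = o(1).$$

That is, $\bar X_n \xrightarrow{P} 0$. Thus,

$$\frac{1}{k_n} \sum_{i=1}^{k_n} Y_{ni}^2 = \frac{\sum_{i=1}^{k_n} X_{ni}^2 - k_n \bar X_n^2}{(k_n - 1)(1 - EX_{n1}X_{n2})} \xrightarrow{P} 1 .$$

A direct application of the theorem to the $\{Y_{ni}\}$ shows that

$$\sqrt{\frac{k_n}{(k_n - 1)(1 - E(X_{n1}X_{n2}))}}\ \frac{1}{\sqrt{m_n}} \sum_{i=1}^{m_n} (X_{ni} - \bar X_n) \xrightarrow{L} N_{0,1-\alpha},$$

which, in view of (4), implies that

$$\frac{1}{\sqrt{m_n}} \sum_{i=1}^{m_n} (X_{ni} - \bar X_n) \xrightarrow{L} N_{0,1-\alpha} .$$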
COROLLARY 4. For each $n = 1, 2, \dots$, let $\{X_{ni}\}$, $i = 1, \dots, k_n$, be i.r.v.'s
with $EX_{ni} = 0$, $EX_{ni}^2 = 1$. If $m_n$ is a sequence of positive integers such that
$\lim_{n\to\infty} m_n/k_n = \alpha$, $0 < \alpha < 1$, and the $\{X_{ni}\}$ satisfy

$$\operatorname{Cov}(X_{n1}, X_{n2}) = -\frac{1}{k_n} + o\Bigl(\frac{1}{k_n}\Bigr) \tag{4'}$$

and either (3) or (1) and (2), then

$$\frac{1}{\sqrt{m_n}} \sum_{i=1}^{m_n} X_{ni} \xrightarrow{L} N_{0,1-\alpha} .$$

PROOF. Since, as shown in the proof of Corollary 1, (3) implies (1) and (2),
it suffices to suppose that the latter obtain. But (4') clearly implies (4) whence,
according to Corollary 3,

$$\frac{1}{\sqrt{m_n}} \sum_{i=1}^{m_n} X_{ni} - \sqrt{m_n}\, \bar X_n \xrightarrow{L} N_{0,1-\alpha} .$$

However, for arbitrary positive $\epsilon$,

$$P\{|\sqrt{m_n}\, \bar X_n| > \epsilon\} \le \frac{m_n}{\epsilon^2 k_n^2}\, E\Bigl(\sum_{i=1}^{k_n} X_{ni}\Bigr)^2
= \frac{m_n}{\epsilon^2 k_n} \bigl[1 + (k_n - 1)\operatorname{Cov}(X_{n1}, X_{n2})\bigr] = o(1),$$

employing (4'). Thus, $\sqrt{m_n}\, \bar X_n \xrightarrow{P} 0$ and $(1/\sqrt{m_n}) \sum_{i=1}^{m_n} X_{ni} \xrightarrow{L} N_{0,1-\alpha}$.


In this instance, not only does $\bar X_n \xrightarrow{P} 0$, but even $(1/\sqrt{k_n}) \sum_{i=1}^{k_n} X_{ni} \xrightarrow{P} 0$, which
is perhaps more than might be desired. Note that (4') automatically prevails
if the $X_{ni}$ sum to $C_n$; in fact, $\operatorname{Cov}(X_{n1}, X_{n2}) = -1/(k_n - 1)$ in this case.

Define $Z_{ni} = X_{ni}/\sqrt{k_n}$. If (i) is replaced by (iii), $EX_{ni} = 0$, and (ii) still
obtains, then $EZ_{ni} = 0$, $\sum_{i=1}^{k_n} \sigma_{Z_{ni}}^2 = 1$. Conditions (1) and (2) become

$$\max_{1 \le i \le k_n} |Z_{ni}| \xrightarrow{P} 0 \tag{1'}$$

and

$$\sum_{i=1}^{k_n} Z_{ni}^2 \xrightarrow{P} 1 . \tag{2'}$$

Then, in view of theorems cited in Section 3, the condition (2') implies (1') (and
correspondingly (2) implies (1)) for infinitesimal row-wise independent r.v.'s
satisfying (ii) and (iii).

Of course, condition (i) precludes independence. Nonetheless, it should be 
verified for interchangeable r.v.’s satisfying (i) and (ii) that conditions (1) and 
(2) do not overlap. This may be seen from the following examples: 

EXAMPLE 1. Let $(X_{n1}, X_{n2}, \dots, X_{n,2n})$ be a random permutation of

$$(\sqrt{n},\ -\sqrt{n},\ 0,\ 0,\ \dots,\ 0).$$

Then $\sum_{i=1}^{2n} X_{ni} = 0$, $(1/2n) \sum_{i=1}^{2n} X_{ni}^2 = 1$, but $\max_{1 \le i \le 2n} |X_{ni}|/\sqrt{2n} = 1/\sqrt{2}$.
EXAMPLE 2. Let $X = (X_{n1}, X_{n2}, \dots, X_{n,2n}) = (0, 0, \dots, 0)$ with probability
$1 - p_n \to 1$, and otherwise let X be a random permutation of

$$\Bigl(\frac{1}{\sqrt{p_n}},\ -\frac{1}{\sqrt{p_n}},\ \frac{1}{\sqrt{p_n}},\ \dots,\ -\frac{1}{\sqrt{p_n}}\Bigr).$$

Then $\sum_{i=1}^{2n} X_{ni} = 0$, $E(X_{ni}^2) = 1$, and the $X_{ni}$ are i.r.v.'s. Now

$$\max_{1 \le i \le 2n} \frac{|X_{ni}|}{\sqrt{2n}} = \frac{|X_{n1}|}{\sqrt{2n}} \xrightarrow{P} 0 .$$

But $(1/2n) \sum_{i=1}^{2n} X_{ni}^2 = 0$ with probability $1 - p_n \to 1$ and hence converges to
zero in probability.


5. Illustrations. 

EXAMPLE 1. Quantiles. Let k, n be positive integers and $U_1, U_2, \dots, U_{kn-1}$
independent r.v.'s each uniformly distributed on (0, 1). Take $U_j^*$ to be the jth
smallest of $(U_1, U_2, \dots, U_{kn-1})$, $j = 1, 2, \dots, kn - 1$. That is,
$U_1^* \le U_2^* \le \dots \le U_{kn-1}^*$ are the order statistics from a uniform or rectangular
distribution. Designate the successive differences $U_i^* - U_{i-1}^*$ by $V_i$, $i =
1, 2, \dots, kn$, where $U_0^* = 0$, $U_{kn}^* = 1$.

It is well known that $V_1, V_2, \dots, V_{kn}$ are interchangeable random variables
adding up to one. In fact, any $kn - 1$ of them have a joint density

$$f(v_1, \dots, v_{i-1}, v_{i+1}, \dots, v_{kn}) =
\begin{cases} (kn - 1)! & \text{for } \sum_{j \ne i} v_j \le 1,\ v_j \ge 0, \\ 0 & \text{otherwise.} \end{cases}$$

A routine but tedious calculation or a non-routine exciting application of the
Poisson stochastic process yields

$$E[V_1^j] = \frac{j!\,(kn - 1)!}{(kn - 1 + j)!}, \qquad E[V_1^2 V_2^2] = \frac{4(kn - 1)!}{(kn + 3)!},$$

$$E[V_1 V_2] = \frac{(kn - 1)!}{(kn + 1)!}, \qquad E[V_1^2 V_2] = \frac{2(kn - 1)!}{(kn + 2)!} .$$


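Further, $V_1, \dots, V_{kn}$ are i.r.v.'s and likewise $X_{n1}, \dots, X_{nk_n}$, where $k_n = kn$
and

$$X_{ni} = \frac{kn\,\bigl(V_i - \frac{1}{kn}\bigr)}{\sqrt{(kn - 1)(kn + 1)^{-1}}}, \qquad i = 1, 2, \dots, k_n .$$

Moreover, $\sum_{i=1}^{kn} X_{ni} = 0$ and $\sigma_{X_{ni}}^2 = 1$, $i = 1, \dots, kn$. The prior array of
expected values furnishes the estimates:

$$EX_{ni}^4 = O(n^4)\, E\Bigl|V_i - \frac{1}{kn}\Bigr|^4 = O(n^4)\, O(n^{-4}) = O(1)$$

and

$$\begin{aligned}
\operatorname{Cov}(X_{n1}^2, X_{n2}^2)
&= O(n^4)\operatorname{Cov}\Bigl[\Bigl(V_1 - \frac{1}{kn}\Bigr)^2, \Bigl(V_2 - \frac{1}{kn}\Bigr)^2\Bigr] \\
&= O(n^4)\Bigl\{E V_1^2 V_2^2 - (E V_1^2)^2 - \frac{4}{kn}\bigl[E V_1^2 V_2 - E V_1^2\, E V_1\bigr] + \frac{4}{(kn)^2}\bigl[E V_1 V_2 - (E V_1)^2\bigr]\Bigr\} \\
&= O(n^4)\, O(n^{-5}) = O(n^{-1}).
\end{aligned}$$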


If, now, $m_n = n$, it follows from Corollary 1 to Theorem 1 that

$$\frac{1}{\sqrt{n}}\, \frac{kn\,(U_n^* - 1/k)}{\sqrt{(kn - 1)(kn + 1)^{-1}}} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_{ni}$$

has a limiting normal distribution with mean zero and variance $1 - 1/k$. The
same statement then applies to $k\sqrt{n}\,(U_n^* - 1/k)$.

Thus, the sample quantile $U_n^*$ of order $1/k$ in a sample of $kn - 1$ from a
rectangular distribution is asymptotically normal with expected value $1/k$ and
variance $(k - 1)/k^3 n$.
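The limiting variance can be compared with the exact finite-sample value. The sketch below relies on a standard fact that is not stated in the text and is assumed here: the jth smallest of N independent uniform(0, 1) variables follows a Beta(j, N - j + 1) law. The function names are hypothetical.

```python
from fractions import Fraction

def uniform_order_stat_moments(j, sample_size):
    """Exact mean and variance of the j-th smallest of `sample_size` independent
    uniform(0, 1) variables, using the Beta(j, sample_size - j + 1) law (assumed)."""
    a, b = j, sample_size - j + 1
    mean = Fraction(a, a + b)
    var = Fraction(a * b, (a + b) ** 2 * (a + b + 1))
    return mean, var

def asymptotic_variance(k, n):
    """(k - 1) / (k**3 * n), the limiting variance quoted in the text."""
    return Fraction(k - 1, k ** 3 * n)

k = 4
for n in (10, 100, 1000):
    mean, var = uniform_order_stat_moments(n, k * n - 1)
    ratio = var / asymptotic_variance(k, n)
    # The exact mean is 1/k; the variance ratio is kn/(kn + 1), which tends to 1.
    print(n, mean, float(ratio))
```

The exact variance is $(k-1)/[k^2(kn+1)]$, so the quoted asymptotic value differs from it only by the factor $kn/(kn+1)$.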

Clearly, an analogous statement holds with $1/k$ replaced by any real number
q in (0, 1). This conclusion extends to other distributions than the rectangular,
e.g., if the c.d.f. F(x) has a continuous non-zero derivative at the unique solution
$x_k$ of $F(x) = 1/k$. These facts are, of course, well known.

Note, in addition, that

$$EX_{n1} X_{n2} = O(n^2)\, E[(V_1 - 1/kn)(V_2 - 1/kn)]
= O(n^2)\Bigl[\frac{1}{kn(kn + 1)} - \frac{1}{(kn)^2}\Bigr]
= O(n^2)\, O(n^{-3}) = o(1).$$


Thus, if (for specificity) k = 2, an application of Corollary 3 yields the con- 
clusion that 








1 2n(U, — 3) > > ,/mtl ; | 
= — ni,| = 2 *) — 2 
Vn l a — 1)(2n + 1)7? - | vil Qn — 1 (X. x 


is normally distributed in the limit with mean zero and variance $\tfrac12$, where $\tilde X_n$
denotes the sample median. This appears to be new but hardly of overwhelming
interest. A comparable result may be demonstrated in the case of a random casting
of $r_n$ objects into n cells referred to in Section 3.
EXAMPLE 2. Ranks. Let $R_1, \dots, R_{k_n}$ be a random permutation of the integers
$(1, 2, \dots, k_n)$. Define

$$X_{ni} = \frac{R_i - \dfrac{k_n + 1}{2}}{\sqrt{\dfrac{k_n^2 - 1}{12}}} .$$

Then, $(R_1, \dots, R_{k_n})$ and $(X_{n1}, \dots, X_{nk_n})$ each comprise a set of i.r.v.'s.
Moreover,

$$\sum_{i=1}^{k_n} X_{ni} = 0, \qquad \frac{1}{k_n}\sum_{i=1}^{k_n} X_{ni}^2 = 1,$$

$$\max_{1 \le i \le k_n} \frac{|X_{ni}|}{\sqrt{k_n}} \le \sqrt{\frac{3(k_n - 1)}{k_n (k_n + 1)}} \to 0 .$$

A direct application of Corollary 2 of Theorem 1 yields the limiting normality
(mean 0, variance $1 - \alpha$) of

$$\frac{1}{\sqrt{m_n}} \sum_{i=1}^{m_n} \frac{R_i - \dfrac{k_n + 1}{2}}{\sqrt{\dfrac{k_n^2 - 1}{12}}},$$

where $\lim_{n\to\infty} m_n/k_n = \alpha$, $0 < \alpha < 1$, a familiar result.
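The three properties claimed for the standardized ranks are easy to confirm numerically. A minimal sketch follows; the function name `standardized_ranks` is hypothetical, and the centring $(k+1)/2$ and scaling $\sqrt{(k^2-1)/12}$ are the mean and standard deviation of a uniform draw from $1, \dots, k$.

```python
import math

def standardized_ranks(k):
    """Standardize the integers 1..k by their mean (k + 1)/2 and variance (k**2 - 1)/12."""
    scale = math.sqrt((k * k - 1) / 12)
    return [(r - (k + 1) / 2) / scale for r in range(1, k + 1)]

for k in (5, 50, 500):
    x = standardized_ranks(k)
    total = sum(x)                                   # should vanish
    mean_square = sum(v * v for v in x) / k          # should equal one
    max_ratio = max(abs(v) for v in x) / math.sqrt(k)
    bound = math.sqrt(3 * (k - 1) / (k * (k + 1)))   # tends to zero as k grows
    assert abs(total) < 1e-9
    assert abs(mean_square - 1) < 1e-9
    assert max_ratio <= bound + 1e-12
```

Because the set of values is fixed and only their order is random, these checks do not depend on which permutation is drawn.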


REFERENCES 

[1] E. SPARRE ANDERSEN, "On sums of symmetrically dependent random variables,"
Skand. Aktuarietidskr., Vol. 36 (1953), pp. 123-138.

[2] E. SPARRE ANDERSEN, "On fluctuations of sums of random variables," I, II, Math.
Skand., Vol. 1 (1953), pp. 263-285, and Vol. 2 (1954), pp. 195-223.

[3] HARALD CRAMÉR, Mathematical Methods of Statistics, Princeton University Press,
1946.

[4] BRUNO DE FINETTI, "La prévision: ses lois logiques, ses sources subjectives," Annales
de l'Institut Henri Poincaré, Vol. 7 (1937), pp. 1-68.

[5] GNEDENKO-KOLMOGOROV, Limit Distributions for Sums of Independent Random
Variables, Addison-Wesley Publ. Co., Inc., Cambridge, Mass., 1954.

[6] MICHEL LOÈVE, Probability Theory, D. Van Nostrand, New York, 1955.

[7] R. VON MISES, "Über Aufteilungs- und Besetzungswahrscheinlichkeiten," Revue de la
Faculté des Sciences de l'Université d'Istanbul, N.S., (1939), pp. 145-163.

[8] GOTTFRIED NOETHER, "On a theorem by Wald and Wolfowitz," Ann. Math. Stat., Vol. 20
(1949), pp. 455-458.

[9] IRVING WEISS, "Limiting Distributions in Some Occupancy Problems," Technical
Report No. 28, Applied Mathematics and Statistics Laboratory, Stanford
University, Office of Naval Research Contract N6onr-25140, 1955.





LINEAR ESTIMATION FROM CENSORED DATA
By R. L. PLACKETT
University of Liverpool
1. Introduction. Suppose that a sample of n random variables is taken from
a continuous probability distribution, whose density function is $f[(y - \mu)/\sigma]/\sigma$,
where $\mu$ and $\sigma$ are unknown. Arrange the variables in order of magnitude, and
denote them by $y_1, y_2, \dots, y_n$, where

$$y_1 \le y_2 \le \dots \le y_n .$$

We shall discuss the problem of estimating $\mu$ and $\sigma$ from the k successive variables
$y_u, y_{u+1}, \dots, y_v$, where $v = u + k - 1$. This problem arises, for example, in
life-testing, and some applications are described by Gupta [7].

When using the principal results derived here, the expected values of ordered
variables are essential, but tables of these quantities for normal samples are
at present somewhat limited. However, recent studies by Berkson [1] have shown
the importance of the logistic distribution, which closely resembles the normal,
and some properties of ordered logistic variables are given in Section 2. We now
turn to the main problem. If u and v are fixed, the best linear unbiased estimates
of $\mu$ and $\sigma$ can be calculated by least squares, given the expected value and
dispersion matrix of the vector of ordered variables (Godwin [6], Lloyd [11],
Gupta [7]). In general, special tables become necessary, and it seems desirable to
obtain simple formulae when samples are moderate or large in size. This is
achieved in Section 3, where asymptotic values of the coefficients of
$y_u, y_{u+1}, \dots, y_v$ are derived. An examination of the conditions involved is
supplied in Section 4, by considering the limiting form of the maximum likelihood
equations. Several illustrative numerical tables complete the paper.


2. Ordered logistic variables. The logistic distribution is defined by

$$L = \log\{p/(1 - p)\}, \tag{1}$$

where p is the probability of a value less than L. Suppose that L(i; n) is the
ith variable in ascending order in a sample of size n from this distribution. Then

$$\mathcal{E} \exp\{wL(i; n)\} = \frac{n!}{(i - 1)!\,(n - i)!} \int_0^1 \Bigl(\frac{p}{1 - p}\Bigr)^{w} p^{i-1} (1 - p)^{n-i}\, dp
= \frac{(i - 1 + w)!\,(n - i - w)!}{(i - 1)!\,(n - i)!} . \tag{2}$$

Take logarithms, differentiate with respect to w, and put w = 0. The cumulants
of L(i; n) are

$$\kappa_r(i; n) = \frac{d^r}{di^r} \log (i - 1)! + (-1)^r \frac{d^r}{dn^r} \log (n - i)! \tag{3}$$

and are thus expressible in terms of polygamma functions, tabulated for r = 1,
2, 3, 4 in [2]. For $(i - 1) > (n - i)$, we obtain

$$\kappa_1(i; n) = \frac{1}{n - i + 1} + \frac{1}{n - i + 2} + \dots + \frac{1}{i - 1}, \tag{4}$$

$$\kappa_2(i; n) = \frac{\pi^2}{3} - \Bigl\{1 + \frac{1}{2^2} + \dots + \frac{1}{(i - 1)^2}\Bigr\} - \Bigl\{1 + \frac{1}{2^2} + \dots + \frac{1}{(n - i)^2}\Bigr\}, \tag{5}$$

$$\kappa_3(i; n) = 2\Bigl\{\frac{1}{(n - i + 1)^3} + \frac{1}{(n - i + 2)^3} + \dots + \frac{1}{(i - 1)^3}\Bigr\}, \tag{6}$$

$$\kappa_4(i; n) = \frac{2\pi^4}{15} - 6\Bigl\{1 + \frac{1}{2^4} + \dots + \frac{1}{(i - 1)^4}\Bigr\} - 6\Bigl\{1 + \frac{1}{2^4} + \dots + \frac{1}{(n - i)^4}\Bigr\}. \tag{7}$$
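The finite sums (4) to (7) translate directly into code. The sketch below is one possible transcription, written with differences of power sums so that it also covers $i - 1 \le n - i$; the function names are hypothetical, and the standard logistic case (i = n = 1) is used as a sanity check because its variance $\pi^2/3$ and fourth cumulant $2\pi^4/15$ are known.

```python
import math

def power_sum(m, r):
    """1 + 1/2**r + ... + 1/m**r (zero when m == 0)."""
    return sum(1.0 / j ** r for j in range(1, m + 1))

def logistic_order_cumulants(i, n):
    """First four cumulants of the i-th smallest of n standard logistic variables."""
    lo, hi = n - i, i - 1
    k1 = power_sum(hi, 1) - power_sum(lo, 1)                  # eq. (4)
    k2 = math.pi ** 2 / 3 - power_sum(hi, 2) - power_sum(lo, 2)   # eq. (5)
    k3 = 2 * (power_sum(hi, 3) - power_sum(lo, 3))            # eq. (6)
    k4 = 2 * math.pi ** 4 / 15 - 6 * power_sum(hi, 4) - 6 * power_sum(lo, 4)  # eq. (7)
    return k1, k2, k3, k4

# A single standard logistic variable: mean 0, variance pi^2/3, no skewness.
k1, k2, k3, k4 = logistic_order_cumulants(1, 1)
assert k1 == 0 and k3 == 0
assert abs(k2 - math.pi ** 2 / 3) < 1e-12
assert abs(k4 - 2 * math.pi ** 4 / 15) < 1e-12

print(logistic_order_cumulants(19, 25))
```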


Suppose that x(i; n) is the ith variable in ascending order in a sample of size
n from the probability distribution whose density function is f(x) and
distribution function F(x). Let $\alpha$ be fixed, $0 < \alpha < 1$, and define $\xi$ by

$$\alpha = F(\xi). \tag{8}$$

We require the two following results. As $n \to \infty$, with $i = [n\alpha]$,

$$\mathcal{E}x(i; n) = \xi + O(n^{-1}) \tag{9}$$

and

$$F\{\mathcal{E}x(i + 1; n)\} - F\{\mathcal{E}x(i; n)\} = 1/n + O(n^{-2}). \tag{10}$$

The proofs are based on the Taylor expansion of x, considered as a function of
L, about the value $L = \kappa_1(i; n)$. This, after expectation, gives

$$\mathcal{E}x(i; n) = x + \tfrac{1}{2} x^{(2)} \kappa_2 + \tfrac{1}{6} x^{(3)} \kappa_3 + \tfrac{1}{24} x^{(4)} (\kappa_4 + 3\kappa_2^2) + \cdots \tag{11}$$

where $x^{(j)}$ is the value at $L = \kappa_1(i; n)$ of the jth derivative of x with respect to L.
Now

$$\kappa_1(i; n) = \tfrac{1}{2} \log\{(i - 1)i/(n - i)(n - i + 1)\} + O(n^{-2}) \tag{12}$$

whence

$$\kappa_1(i; n) = \lambda + O(n^{-1}), \tag{13}$$

where

$$\lambda = \log\{\alpha/(1 - \alpha)\}. \tag{14}$$

Also

$$\kappa_j(i; n) = O(n^{1-j}) \qquad (j = 2, 3, \dots). \tag{15}$$

Assuming $x^{(2)}$ to be bounded, we can substitute (13) and (15) in (11) to obtain
(9). As regards (10), we suppose that $x^{(2)}$ and $x^{(3)}$ are bounded, in which case

$$\mathcal{E}x(i + 1; n) - \mathcal{E}x(i; n) = \frac{n}{i(n - i)}\, x^{(1)} + O(n^{-2}). \tag{16}$$

On the further assumption that df/dx is bounded, (10) results.
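We shall now consider the standard normal distribution in more detail. Denote
its density function by $\phi(x)$ and distribution function by $\Phi(x)$. Here

$$x^{(1)} = \Phi(1 - \Phi)/\phi, \tag{17}$$

$$x^{(2)} = x^{(1)}\{x\,x^{(1)} - (2\Phi - 1)\}, \tag{18}$$

$$x^{(3)} = (x^{(1)})^3 + 2x\,x^{(1)} x^{(2)} + x^{(2)}(1 - 2\Phi) - 2x^{(1)}\Phi(1 - \Phi), \tag{19}$$

$$x^{(4)} = 5(x^{(1)})^2 x^{(2)} + x^{(3)}\{2x\,x^{(1)} - (2\Phi - 1)\}
+ 2x^{(2)}\{x\,x^{(2)} - 2\Phi(1 - \Phi)\} + 2x^{(1)}(2\Phi - 1)\Phi(1 - \Phi). \tag{20}$$

These derivatives are all bounded, their maximum absolute values being given
below.

$x^{(1)}$ | $x^{(2)}$ | $x^{(3)}$ | $x^{(4)}$
0.62666 | 0.07376 | 0.06724 | 0.04597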
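The first two tabulated maxima can be approximated by evaluating (17) and (18) on a grid. This is only a rough numerical sketch: the grid range and spacing are arbitrary choices, and the function names `x1` and `x2` are hypothetical.

```python
from statistics import NormalDist

_std = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def x1(x):
    """First derivative of x with respect to L for the standard normal, eq. (17)."""
    big_phi = _std.cdf(x)
    return big_phi * (1 - big_phi) / _std.pdf(x)

def x2(x):
    """Second derivative of x with respect to L, eq. (18)."""
    big_phi = _std.cdf(x)
    return x1(x) * (x * x1(x) - (2 * big_phi - 1))

# Grid search over [-6, 6] in steps of 0.001 (an arbitrary choice for this sketch).
grid = [j / 1000 for j in range(-6000, 6001)]
max_x1 = max(abs(x1(x)) for x in grid)
max_x2 = max(abs(x2(x)) for x in grid)

# Compare with the tabulated values 0.62666 and 0.07376.
print(max_x1, max_x2)
```

The maximum of $|x^{(1)}|$ is attained at x = 0, where it equals $0.25/\phi(0)$; the maximum of $|x^{(2)}|$ appears to lie a little above $|x| = 1$.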


The absolute value of the remainder after $(j - 1)$ terms of the series (11) is
at most $\beta_j \max |x^{(j)}|/j!$, where $\beta_j$ is the jth absolute moment about the mean
of the ith ordered logistic variable in a sample of n. Since $\beta_j$ is known when j
is even and the inequality $(\beta_j)^{1/j} \le (\beta_{j+1})^{1/(j+1)}$ is available when j is odd, we
can thus assign bounds to $\mathcal{E}x(i; n)$ for all values of i. As an illustration, take
$\mathcal{E}x(19; 25)$.

Series (11) to j terms | Absolute maximum error
0.642835 | 0.007656
0.636781 | 0.002521
0.636656 | 0.000262

David and Johnson [5] express x as a function of p, and the value for $\mathcal{E}x(19; 25)$
from the first four terms of the series on p. 236 of their paper is 0.636904.
However, their formula is arranged as a power series in $(n + 2)^{-1}$, and a similar
rearrangement of (11) would be necessary before a full comparison of the two
approaches can be made. This will be undertaken on another occasion.
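The first row of the illustration can be reproduced from quantities already defined. The sketch below assumes that the first tabulated value is the leading term of (11), i.e. the normal quantile evaluated at $L = \kappa_1(19; 25)$, and that its error bound is $\beta_2 \max|x^{(2)}|/2!$ with $\beta_2 = \kappa_2$; the function names are hypothetical and the constant 0.07376 is taken from the table of maxima.

```python
import math
from statistics import NormalDist

def kappa1(i, n):
    """Mean of the i-th ordered logistic variable, eq. (4) (assumes i - 1 > n - i)."""
    return sum(1.0 / j for j in range(n - i + 1, i))

def kappa2(i, n):
    """Variance of the i-th ordered logistic variable, eq. (5)."""
    s_hi = sum(1.0 / j ** 2 for j in range(1, i))
    s_lo = sum(1.0 / j ** 2 for j in range(1, n - i + 1))
    return math.pi ** 2 / 3 - s_hi - s_lo

def leading_term(i, n):
    """x at L = kappa_1(i; n): invert L = log(p / (1 - p)), then take the normal quantile."""
    p = 1.0 / (1.0 + math.exp(-kappa1(i, n)))
    return NormalDist().inv_cdf(p)

MAX_X2 = 0.07376  # tabulated maximum of |x^(2)| for the standard normal

first_term = leading_term(19, 25)
first_bound = kappa2(19, 25) * MAX_X2 / 2   # beta_2 * max|x^(2)| / 2!
print(first_term, first_bound)
```

Under these assumptions the two printed numbers agree with the first row of the table to the digits shown there.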


3. Least squares estimation. Let $t_i$ denote the expectation of $(y_i - \mu)/\sigma$.
Write

$$f_i = f(t_i), \tag{21}$$

$$p_i = F(t_i), \tag{22}$$

and

$$q_i = 1 - p_i \qquad (i = u, u + 1, \dots, v). \tag{23}$$

Let m be the vector of $(y_i - \mu)/\sigma$ for $i = u, u + 1, \dots, v$ and put

$$t = \mathcal{E}m \tag{24}$$

and

$$V = n\,\mathcal{E}\{(m - \mathcal{E}m)(m - \mathcal{E}m)'\}. \tag{25}$$

In principle, t and V can be computed from the known function f(x). The
estimate of

$$\theta = \begin{pmatrix} \mu \\ \sigma \end{pmatrix} \tag{26}$$

given by generalized least squares is

$$\theta^* = (A'V^{-1}A)^{-1} A'V^{-1} y, \tag{27}$$

where

$$A = [\,1 \quad t\,]. \tag{28}$$

As V is difficult to handle analytically, we replace it by W, a symmetric matrix
whose elements are $\{a_i b_j\}$ for $i \le j$, where

$$a_i = p_i/f_i \qquad (i = u, u + 1, \dots, v) \tag{29}$$

and

$$b_j = q_j/f_j \qquad (j = u, u + 1, \dots, v). \tag{30}$$

Since $\mathcal{D}y \sim W\sigma^2/n$, the unbiased estimate

$$\theta^{\circ} = (A'W^{-1}A)^{-1} A'W^{-1} y \tag{31}$$

may be presumed to have the same asymptotic properties as $\theta^*$. We therefore
consider the limiting form of $\theta^{\circ}$.
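The inverse of W has been given by Hammersley and Morton [9]. Put

$$a_{u-1} = 0, \qquad a_{v+1} = 1, \qquad b_{u-1} = 1, \qquad b_{v+1} = 0. \tag{32}$$

Then

$$W^{-1} = \begin{pmatrix}
c_u & d_u & 0 & \cdots & 0 & 0 \\
d_u & c_{u+1} & d_{u+1} & \cdots & 0 & 0 \\
0 & d_{u+1} & c_{u+2} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & c_{v-1} & d_{v-1} \\
0 & 0 & 0 & \cdots & d_{v-1} & c_v
\end{pmatrix} \tag{33}$$

where

$$c_i = (a_{i+1} b_{i-1} - a_{i-1} b_{i+1}) / (a_i b_{i+1} - a_{i+1} b_i)(a_{i-1} b_i - a_i b_{i-1}) \tag{34}$$

and

$$d_i = 1/(a_i b_{i+1} - a_{i+1} b_i). \tag{35}$$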
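The tridiagonal form of the inverse can be verified exactly on a small instance. In the sketch below the values of p and f are arbitrary illustrative numbers (p increasing, f positive), not taken from any distribution, and the function names are hypothetical; the check multiplies W by the matrix assembled from (32) to (35) and compares with the identity.

```python
from fractions import Fraction

def build_w(p, f):
    """W[i][j] = a_i * b_j for i <= j (symmetric), with a = p/f and b = (1 - p)/f."""
    a = [pi / fi for pi, fi in zip(p, f)]
    b = [(1 - pi) / fi for pi, fi in zip(p, f)]
    k = len(p)
    w = [[a[min(i, j)] * b[max(i, j)] for j in range(k)] for i in range(k)]
    return a, b, w

def tridiagonal_inverse(a, b):
    """Assemble the inverse of W from (32)-(35)."""
    k = len(a)
    # Boundary conventions (32): a_{u-1} = 0, a_{v+1} = 1, b_{u-1} = 1, b_{v+1} = 0.
    ea = [Fraction(0)] + a + [Fraction(1)]
    eb = [Fraction(1)] + b + [Fraction(0)]
    inv = [[Fraction(0)] * k for _ in range(k)]
    for i in range(1, k + 1):                      # i indexes the padded lists
        num = ea[i + 1] * eb[i - 1] - ea[i - 1] * eb[i + 1]
        den = (ea[i] * eb[i + 1] - ea[i + 1] * eb[i]) * (ea[i - 1] * eb[i] - ea[i] * eb[i - 1])
        inv[i - 1][i - 1] = num / den              # diagonal entry c_i, eq. (34)
        if i < k:
            d = 1 / (ea[i] * eb[i + 1] - ea[i + 1] * eb[i])   # off-diagonal d_i, eq. (35)
            inv[i - 1][i] = inv[i][i - 1] = d
    return inv

def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

p = [Fraction(1, 5), Fraction(2, 5), Fraction(1, 2), Fraction(7, 10)]
f = [Fraction(1, 3), Fraction(2, 3), Fraction(3, 4), Fraction(1, 2)]
a, b, w = build_w(p, f)
product = matmul(w, tridiagonal_inverse(a, b))
identity = [[Fraction(int(i == j)) for j in range(4)] for i in range(4)]
assert product == identity
```

Since $a_{i+1} b_i - a_i b_{i+1} = (p_{i+1} - p_i)/f_i f_{i+1}$, the off-diagonal entries are well defined whenever the $p_i$ are strictly increasing.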
Denote $A'W^{-1}$ by G, and define

$$h_s = \tfrac{1}{2}(p_{s+1} - p_{s-1}) \qquad (s = u + 1, u + 2, \dots, v - 1), \tag{36}$$

$$h_u = p_{u+1} - p_u, \tag{37}$$

and

$$h_v = p_v - p_{v-1}. \tag{38}$$

If the elements of G are considered as functions of $p_u, p_{u+1}, \dots, p_v$ then $g_{1u}$
depends on $p_u$ and $p_{u+1}$; $g_{1s}$ on $p_{s-1}$, $p_s$, and $p_{s+1}$; and $g_{1v}$ on $p_{v-1}$ and $p_v$. In
$g_{1u}$, put $p_{u+1} = p_u + h_u$; in $g_{1s}$, replace $p_{s-1}$ by $p_s - h_s$ and $p_{s+1}$ by $p_s + h_s$;
and in $g_{1v}$, put $p_{v-1} = p_v - h_v$. The first and third substitutions are exact; the
second one is approximate, but if $n \to \infty$ with $u = [n\alpha]$ and $v = [n\beta]$, the values
of $p_s$ tend to become equally spaced between $\alpha$ and $\beta$, as (10) shows. The elements
in the second row of G are treated similarly, so that both elements in the ith
column are now expressed as functions of $p_i$ and $h_i$. Expanding by Taylor series
as far as $h_s^3$ in the numerators of $g_{1s}$ and $g_{2s}$, and as far as $h^2$ elsewhere, the elements
of G finally reduce, after a good deal of straightforward algebra, to the
expressions given below. The primes signify differentiation with respect to p, so that

$$f' = \frac{d \log f}{dx} \tag{39}$$

and

$$f'' = \frac{1}{f}\, \frac{d^2 \log f}{dx^2} . \tag{40}$$

In calculating the elements of $A'W^{-1}A$, we pass from sums involving $h_s$ to
integrals involving dp.


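4. Maximum likelihood estimation. The likelihood of $y_u, y_{u+1}, \dots, y_v$ is

$$\frac{n!}{(u - 1)!\,(n - v)!} \Bigl\{F\Bigl(\frac{y_u - \mu}{\sigma}\Bigr)\Bigr\}^{u-1} \prod_{i=u}^{v} \frac{1}{\sigma} f\Bigl(\frac{y_i - \mu}{\sigma}\Bigr) \Bigl\{1 - F\Bigl(\frac{y_v - \mu}{\sigma}\Bigr)\Bigr\}^{n-v} . \tag{41}$$

Denote by $\hat\mu$ and $\hat\sigma$ the maximum likelihood estimates of $\mu$ and $\sigma$, respectively.
They satisfy the equations

$$\frac{(u - 1)\, f\bigl(\frac{y_u - \hat\mu}{\hat\sigma}\bigr)}{n F\bigl(\frac{y_u - \hat\mu}{\hat\sigma}\bigr)}
+ \frac{1}{n} \sum_{i=u}^{v} \frac{d \log f}{dx}\Bigl(\frac{y_i - \hat\mu}{\hat\sigma}\Bigr)
- \frac{(n - v)\, f\bigl(\frac{y_v - \hat\mu}{\hat\sigma}\bigr)}{n \bigl\{1 - F\bigl(\frac{y_v - \hat\mu}{\hat\sigma}\bigr)\bigr\}} = 0, \tag{42}$$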

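TABLE 1
Asymptotic value of $A'W^{-1}$

Row, Column | General Density | Normal Density
(1, u) | $f_u^2/p_u - f_u f_u' - \tfrac12 h_u f_u f_u''$ | $f_u^2/p_u + t_u f_u + \tfrac12 h_u$
(1, s) | $-h_s f_s f_s''$ | $h_s$
(1, v) | $f_v^2/q_v + f_v f_v' - \tfrac12 h_v f_v f_v''$ | $f_v^2/q_v - t_v f_v + \tfrac12 h_v$
(2, u) | $t_u f_u^2/p_u - t_u f_u f_u' - f_u - \tfrac12 h_u (f_u' + f_u f_u'' t_u)$ | $t_u f_u^2/p_u + t_u^2 f_u - f_u + h_u t_u$
(2, s) | $-h_s (f_s' + f_s f_s'' t_s)$ | $2 h_s t_s$
(2, v) | $t_v f_v^2/q_v + t_v f_v f_v' + f_v - \tfrac12 h_v (f_v' + f_v f_v'' t_v)$ | $t_v f_v^2/q_v - t_v^2 f_v + f_v + h_v t_v$

TABLE 2
Asymptotic value of $A'W^{-1}A$

Row, Column | General Density | Normal Density
(1, 1) | $-\int_{p_u}^{p_v} f f''\, dp + f_v^2/q_v + f_v f_v' + f_u^2/p_u - f_u f_u'$ | $f_v^2/q_v - t_v f_v + p_v + f_u^2/p_u + t_u f_u - p_u$
(1, 2) and (2, 1) | $-\int_{p_u}^{p_v} t f f''\, dp + t_v f_v^2/q_v + t_v f_v f_v' + t_u f_u^2/p_u - t_u f_u f_u'$ | $t_v f_v^2/q_v - t_v^2 f_v - f_v + t_u f_u^2/p_u + t_u^2 f_u + f_u$
(2, 2) | $-\int_{p_u}^{p_v} t^2 f f''\, dp + t_v^2 f_v^2/q_v + t_v^2 f_v f_v' + p_v + t_u^2 f_u^2/p_u - t_u^2 f_u f_u' - p_u$ | $t_v^2 f_v^2/q_v - t_v^3 f_v - t_v f_v + 2p_v + t_u^2 f_u^2/p_u + t_u^3 f_u + t_u f_u - 2p_u$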
$$\frac{(u - 1) \bigl(\frac{y_u - \hat\mu}{\hat\sigma}\bigr) f\bigl(\frac{y_u - \hat\mu}{\hat\sigma}\bigr)}{n F\bigl(\frac{y_u - \hat\mu}{\hat\sigma}\bigr)}
+ \frac{k}{n} + \frac{1}{n} \sum_{i=u}^{v} \Bigl(\frac{y_i - \hat\mu}{\hat\sigma}\Bigr) \frac{d \log f}{dx}\Bigl(\frac{y_i - \hat\mu}{\hat\sigma}\Bigr)
- \frac{(n - v) \bigl(\frac{y_v - \hat\mu}{\hat\sigma}\bigr) f\bigl(\frac{y_v - \hat\mu}{\hat\sigma}\bigr)}{n \bigl\{1 - F\bigl(\frac{y_v - \hat\mu}{\hat\sigma}\bigr)\bigr\}} = 0, \tag{43}$$

where

$$\frac{d \log f}{dx}\Bigl(\frac{y_i - \hat\mu}{\hat\sigma}\Bigr)$$

means the value at $(y_i - \hat\mu)/\hat\sigma$ of the function $d \log f/dx$. The direct solution
of (42) and (43) for normal samples has been described by Cohen [3], who used
successive approximation; and, when u = 1, by Gupta [7], who calculated a
special table which shortens the work. Halperin [8] has indicated conditions
under which

(a) the maximum likelihood equations have a consistent set of solutions
$\hat\mu$, $\hat\sigma$;
(b) $\sqrt{n}(\hat\mu - \mu)$ and $\sqrt{n}(\hat\sigma - \sigma)$ have a bivariate normal limit distribution;
(c) the dispersion matrix of the limit distribution is best in the sense of
Cramér [4], §32.6.
The necessary assumptions involve derivatives of $f[(y - \mu)/\sigma]/\sigma$ with respect
to $\mu$ and $\sigma$, and we shall suppose henceforth that they are satisfied.
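We expand $(y_i - \hat\mu)/\hat\sigma$ in a Taylor series about $t_i$ and obtain

$$\begin{aligned}
&\frac{u - 1}{n} \Bigl\{\frac{f_u}{p_u} + \Bigl(\frac{y_u - \hat\mu}{\hat\sigma} - t_u\Bigr)\Bigl(\frac{f_u f_u'}{p_u} - \frac{f_u^2}{p_u^2}\Bigr) + \frac12 \Bigl(\frac{y_u - \hat\mu}{\hat\sigma} - t_u\Bigr)^2 C\Bigr\} \\
&\quad + \frac{1}{n} \sum_{i=u}^{v} \Bigl\{f_i' + \Bigl(\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr) f_i f_i'' + \frac12 \Bigl(\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr)^2 D_i\Bigr\} \\
&\quad - \frac{n - v}{n} \Bigl\{\frac{f_v}{q_v} + \Bigl(\frac{y_v - \hat\mu}{\hat\sigma} - t_v\Bigr)\Bigl(\frac{f_v f_v'}{q_v} + \frac{f_v^2}{q_v^2}\Bigr) + \frac12 \Bigl(\frac{y_v - \hat\mu}{\hat\sigma} - t_v\Bigr)^2 E\Bigr\} = 0,
\end{aligned} \tag{44}$$

$$\begin{aligned}
&\frac{u - 1}{n} \Bigl\{\frac{t_u f_u}{p_u} + \Bigl(\frac{y_u - \hat\mu}{\hat\sigma} - t_u\Bigr)\Bigl(\frac{f_u}{p_u} + \frac{t_u f_u f_u'}{p_u} - \frac{t_u f_u^2}{p_u^2}\Bigr) + \frac12 \Bigl(\frac{y_u - \hat\mu}{\hat\sigma} - t_u\Bigr)^2 R\Bigr\} \\
&\quad + \frac{k}{n} + \frac{1}{n} \sum_{i=u}^{v} \Bigl\{t_i f_i' + \Bigl(\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr)(f_i' + t_i f_i f_i'') + \frac12 \Bigl(\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr)^2 S_i\Bigr\} \\
&\quad - \frac{n - v}{n} \Bigl\{\frac{t_v f_v}{q_v} + \Bigl(\frac{y_v - \hat\mu}{\hat\sigma} - t_v\Bigr)\Bigl(\frac{f_v}{q_v} + \frac{t_v f_v f_v'}{q_v} + \frac{t_v f_v^2}{q_v^2}\Bigr) + \frac12 \Bigl(\frac{y_v - \hat\mu}{\hat\sigma} - t_v\Bigr)^2 T\Bigr\} = 0.
\end{aligned} \tag{45}$$

Here C, $D_i$, E, R, $S_i$ and T are second-order derivatives with respect to x
evaluated at points intermediate between $(y_i - \hat\mu)/\hat\sigma$ and $t_i$; and the primes have
their previous meaning.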
We assume that the second-order derivatives of

$$\frac{d \log f}{dx}, \quad x\,\frac{d \log f}{dx}, \quad \frac{f}{p}, \quad \frac{f}{q}, \quad \frac{xf}{p}, \quad \frac{xf}{q},$$

with respect to x, are functions of bounded variation. This condition is not
satisfied if F(x) = 0 at a finite value of x, since then

$$\Bigl|\frac{d^2}{dx^2}\Bigl(\frac{f}{p}\Bigr)\Bigr| \to \infty$$

at the lower terminus of the distribution; nor if F(x) = 1 for finite x, since

$$\Bigl|\frac{d^2}{dx^2}\Bigl(\frac{f}{q}\Bigr)\Bigr| \to \infty$$

there. However, all is well for the normal and logistic distributions, as the
following table shows.

Maximum absolute values of second-order derivatives

Distribution | $d \log f/dx$ | $x\, d \log f/dx$ | $f/p$ | $f/q$ | $xf/p$ | $xf/q$
Normal | 0 | 2.00 | 0.30 | 0.30 | 2.00 | 2.00
Logistic | 0.19 | 1.00 | 0.10 | 0.10 | 0.50 | 0.50


Let $\alpha$ and $\beta$ be fixed such that $0 < \alpha < \beta < 1$. We assume that $f(t) \ge c > 0$
wherever $F^{-1}(\alpha) \le t \le F^{-1}(\beta)$. For any such t, let f, p and q be defined in
accordance with (21), (22), and (23). Then

$$z^2 = pq/f^2 \tag{46}$$

has a finite maximum, which we denote by $z_1^2$. We proceed to derive the form
taken by the maximum likelihood equations when $u = [n\alpha]$, $v = [n\beta]$, and
$n \to \infty$.

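Consider the variable

$$\frac{y_i - \hat\mu}{\hat\sigma} - t_i = \frac{(y_i - \mu - t_i \sigma) - (\hat\mu - \mu) - t_i(\hat\sigma - \sigma)}{\hat\sigma} . \tag{47}$$

Given $\epsilon_1 > 0$, $\epsilon_2 > 0$, and $\epsilon_3$ such that $\sigma > \epsilon_3 > 0$,

$$\Pr\Bigl\{\Bigl|\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr| > \frac{\epsilon_1 + \epsilon_2}{\sigma - \epsilon_3}\Bigr\}
\le \Pr\{|y_i - \mu - t_i \sigma| > \epsilon_1\}
+ \Pr\{|(\hat\mu - \mu) + t_i(\hat\sigma - \sigma)| > \epsilon_2\} + \Pr\{|\hat\sigma - \sigma| > \epsilon_3\}. \tag{48}$$

Typically, $i = [np]$, and as $n \to \infty$, $\dfrac{\sqrt{n}}{z}\Bigl(\dfrac{y_i - \mu}{\sigma} - \xi\Bigr)$, with $\xi = F^{-1}(p)$, is asymptotically normal
with zero mean and unit variance (Cramér [4], §28.5). According to (9), $(t_i - \xi)$
is $O(n^{-1})$ and so $\dfrac{\sqrt{n}}{z}\Bigl(\dfrac{y_i - \mu}{\sigma} - t_i\Bigr)$ has the same limit distribution. Hence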


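$$\Pr\{|y_i - \mu - t_i \sigma| > \epsilon_1\} \sim 2\Phi(-n^{1/2} \epsilon_1/\sigma z) \sim 2(\sigma z/n^{1/2} \epsilon_1)\, \phi(n^{1/2} \epsilon_1/\sigma z). \tag{49}$$

Similarly, by the asymptotic properties of $\hat\mu$ and $\hat\sigma$, there exist finite quantities
$z_2$ and $z_3$ such that

$$\Pr\{|(\hat\mu - \mu) + t_i(\hat\sigma - \sigma)| > \epsilon_2\} \sim 2(\sigma z_2/n^{1/2} \epsilon_2)\, \phi(n^{1/2} \epsilon_2/\sigma z_2) \tag{50}$$

and

$$\Pr\{|\hat\sigma - \sigma| > \epsilon_3\} \sim 2(\sigma z_3/n^{1/2} \epsilon_3)\, \phi(n^{1/2} \epsilon_3/\sigma z_3). \tag{51}$$

Consequently, as $n \to \infty$,

$$\sum_{i=u}^{v} \Pr\Bigl\{\Bigl|\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr| > \frac{\epsilon_1 + \epsilon_2}{\sigma - \epsilon_3}\Bigr\}
\le 2(\beta - \alpha)\, n^{1/2} \sigma \sum_{j=1}^{3} (z_j/\epsilon_j)\, \phi(n^{1/2} \epsilon_j/\sigma z_j). \tag{52}$$

Thus, given $\epsilon > 0$ and $\delta > 0$, arbitrarily small, we can find N such that

$$\sum_{i=u}^{v} \Pr\Bigl\{\Bigl|\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr| > \epsilon\Bigr\} < \delta \qquad \text{for } n \ge N. \tag{53}$$

Therefore, by Boole's inequality,

$$\Pr\Bigl\{\Bigl|\frac{y_i - \hat\mu}{\hat\sigma} - t_i\Bigr| \le \epsilon \ \text{ for } i = u, u + 1, \dots, v\Bigr\} \ge 1 - \delta \qquad \text{for } n \ge N.$$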

When this event occurs, the terms in (44) and (45) which involve (H># — 1, 
og 


are negligible compared with the remaining terms, because C, D; , E, R, S; and 
T are all bounded. We therefore omit the squared terms, and replace sums by 
integrals as before (cf. Hoeffding [10]), at the same time making an Euler- 
Maclaurin adjustment on the coefficients of y, and y, , so as to correct for bias 
in the estimates and bring the results into line with those previously obtained. 
The linearized form taken asymptotically by the maximum likelihood equations 
is then given by the coefficients in Tables 1 and 2 on replacing A; by 1/n through- 
out. We shall use yw’ and o° to denote the corresponding linearized estimates. 
The asymptotic dispersion matrix of the maximum likelihood estimates is o*/n 
times the inverse of the equations for y’ and o°. 
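
The second step in (49) to (51) replaces a two-sided normal tail probability by the leading term of its asymptotic expansion, $2\Phi(-a) \sim 2\phi(a)/a$ as $a \to \infty$. The following Python sketch compares the two expressions numerically. It uses only the standard library; the function names are illustrative and are not taken from the paper.

```python
import math

def two_sided_tail(a):
    """Exact 2*Phi(-a) for a standard normal variable, computed via erfc."""
    return math.erfc(a / math.sqrt(2.0))

def mills_approx(a):
    """Leading-term approximation 2*phi(a)/a to the two-sided tail."""
    return 2.0 * math.exp(-0.5 * a * a) / (a * math.sqrt(2.0 * math.pi))

# Ratio approximation/exact for a few arguments; it decreases towards 1
# as a grows, because phi(a)/a is an upper bound for Phi(-a) when a > 0.
ratios = {a: mills_approx(a) / two_sided_tail(a) for a in (2.0, 5.0, 10.0)}
```

In the text the argument is $a = n^{1/2}\epsilon/\sigma z$, which grows with $n$, so the approximation improves as the sample size increases.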

These results show not only that the maximum likelihood estimates of $\mu$
and $\sigma$ are asymptotically linear, but also that the best linear unbiased estimates
are asymptotically normal and efficient. In order to compute the coefficients of
the ordered variables, only tables of $f(x)$, $F(x)$ and $t_i$ are necessary. For normal
samples, Teichroew [14] gives $t_i$ to 10 D for $n \le 20$, and an extension to $n \le 100$
with 24 D is being prepared (Ruben [12]). For logistic samples, explicit formulae
have already been given.


TABLE 3A
Coefficients of ordered variables when estimating the mean. $\theta^*$ above, $\theta^0$ below

8634
2.1547
6596
-0.7487
-0.2923
-0.3304
-0.1240
-0.1418
-0.0316
-0.0394
0.0244
.0222


5. Numerical tables. Tables 3A and 3B refer to the estimation of $\theta$ from the
smallest $k$ observations in a sample of size $n = 10$ from a normal distribution.
They give the coefficients of $y_1, y_2, \ldots, y_k$ for $k \le 10$ in

(i) the best linear unbiased estimate, $\theta^* = \begin{pmatrix} \mu^* \\ \sigma^* \end{pmatrix}$;

(ii) the linearized maximum likelihood estimate, $\theta^0 = \begin{pmatrix} \mu^0 \\ \sigma^0 \end{pmatrix}$.

Table 4 gives the coefficients of $\mu$ and $\sigma$ in the expectation of $\theta^0$. Suppose in
general that

\[ E\,\theta^0 = B\theta. \tag{55} \]

Then $B^{-1}\theta^0$ is an unbiased estimate of $\theta$, and the efficiencies of its elements,
relative to $\mu^*$ and $\sigma^*$ respectively, have been calculated from the table of Dm in
Sarhan and Greenberg [13] when, as above, $n = 10$, $u = 1$, and $v = 2, 3, \ldots, 10$.
These efficiencies never fall below 0.9998, a result which suggests that $\theta^0$,
corrected for bias, can be used in place of $\theta^*$, with negligible loss of efficiency, for
all sample sizes of practical importance.


TABLE 3B
Coefficients of ordered variables when estimating the standard
deviation. $\theta^*$ above, $\theta^0$ below

 k       y1       y2       y3       y4       y5       y6       y7       y8       y9      y10

 2  -1.8608   1.8608
    -2.1366   2.0404
 3  -0.9625  -0.4357   1.3981
    -1.0767  -0.4586   1.4738
 4  -0.6520  -0.3150  -0.1593   1.1263
    -0.7190  -0.3330  -0.1611   1.1681
 5  -0.4419  -0.2491  -0.1362  -0.0472   0.9243
    -0.5374  -0.2631  -0.1414  -0.0425   0.9499
 6  -0.3931  -0.2063  -0.1192  -0.0501   0.0111   0.7576
    -0.4266  -0.2175  -0.1250  -0.0498   0.0180   0.7740
 7  -0.3252  -0.1758  -0.1058  -0.0502  -0.0006   0.0469   0.6107
    -0.3513  -0.1849  -0.1114  -0.0517   0.0022   0.0545   0.6218
 8  -0.2753  -0.1523  -0.0947  -0.0488  -0.0077   0.0319   0.0722   0.4746
    -0.2963  -0.1600  -0.0998  -0.0510  -0.0069   0.0358   0.0799   0.4830
 9  -0.2364  -0.1354  -0.0851  -0.0465  -0.0119   0.0215   0.0559   0.0936   0.3423
    -0.2539  -0.1399  -0.0897  -0.0490  -0.0122   0.0234   0.0602   0.1009   0.3505
10  -0.2044  -0.1172  -0.0763  -0.0436  -0.0142   0.0142   0.0436   0.0763   0.1172   0.2044
    -0.2196  -0.1231  -0.0807  -0.0462  -0.0151   0.0151   0.0462   0.0807   0.1231   0.2196


TABLE 4
Expectation of $\theta^0$

 k

 2   0.9007   0.2560  -0.0962
 3   0.9574   0.1104  -0.0616
 4   0.9821   0.0539  -0.0450
 5   0.9962   0.0260  -0.0346
 6   1.0054   0.0108  -0.0270
 7   1.0119   0.0022  -0.0208
 8   1.0166  -0.0023  -0.0153
 9   1.0204  -0.0038  -0.0097
10   1.0243   0.0000   0.0000


TABLE 5

1/n      h1       h2       h3       h4       h5

0.5000   0.4274
0.3333   0.3013   0.3013
0.2500   0.2316   0.2326
0.2000   0.1879   0.1888   0.1897
0.1667   0.1580   0.1588   0.1597
0.1429   0.1362   0.1370   0.1378   0.1378
0.1250   0.1198   0.1204   0.1212   0.1212
0.1111   0.1068   0.1074   0.1081   0.1082   0.1082
0.1000   0.0964   0.0970   0.0976   0.0976   0.0976


Table 5 also refers to normal samples. Used in conjunction with the relation

\[ h_i = h_{n-i+1}, \tag{56} \]

it gives the values of $h_i$ for $n = 2, 3, \ldots, 10$ and $1 \le i \le n$. That there is close
agreement between $\theta^*$ and $\theta^0$ can be inferred from Table 5 in particular and
(10) in general.
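
As a small illustration of how Table 5 is meant to be used, the sketch below expands a tabulated half-row into the full set $h_1, \ldots, h_n$. It assumes that relation (56) is the symmetry $h_i = h_{n-i+1}$ and that each row of the table lists $h_1, \ldots, h_m$ with $m$ the integer part of $(n+1)/2$; the function name `full_row` is hypothetical.

```python
def full_row(half, n):
    """Expand tabulated h_1..h_m, m = (n + 1) // 2, into h_1..h_n
    using the assumed symmetry h_i = h_{n-i+1}."""
    if len(half) != (n + 1) // 2:
        raise ValueError("expected (n + 1) // 2 tabulated values")
    # Index i runs over 0..n-1; its mirror image is n-1-i.
    return [half[min(i, n - 1 - i)] for i in range(n)]

# Row for n = 5 as printed in Table 5 (1/n = 0.2000).
h_n5 = full_row([0.1879, 0.1888, 0.1897], 5)
```

Each expanded value can then be compared with $1/n$, the value that replaces $h_i$ in the linearized equations.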


6. Acknowledgements. I am grateful to Mr. C. J. Taylor for doing all the 
calculations; and to the referee for a correction to my maximal remainder in (11), 
and other helpful remarks.


SEMIMARTINGALES OF MARKOV CHAINS

By JOHN G. KEMENY AND J. LAURIE SNELL
Dartmouth College


1. Introduction. We shall deal throughout this paper with absorbing Markov
chains with a finite number of states. An absorbing Markov chain is one that has
a set of "boundary" states which once reached cannot be left, and such that from
any state the process reaches the boundary with probability 1. The chain is
given by the transition matrix $P$, with entries $p_{ij}$.

More precisely, a state $i$ is a "boundary" state if $p_{ii} = 1$. The remaining states
will be called "interior" states. We must require that it is possible to reach the
boundary from every interior state, not necessarily in one step. We assume
that there are $r$ absorbing states and $s$ interior states. The set of boundary states
will be called $B$, the set of interior states $I$.

An upper semimartingale is a function on the states of the chain, such that 
the expected value of the function after one step from any state is greater than 
or equal to the value of the function at the state. A lower semimartingale is 
defined similarly, with the inequalities reversed. A martingale is a function on 
the states that is both an upper and a lower semimartingale. 

A function on the states can be conveniently represented by a column vector.
Such a vector $z$ is an upper semimartingale if $Pz \ge z$, a lower semimartingale if
$Pz \le z$, and a martingale if $Pz = z$.
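
The three vector conditions can be checked mechanically. The sketch below is illustrative only: the function `classify` and the small chain (a symmetric random walk on the states 0, 1, 2, 3 with 0 and 3 absorbing) are assumptions chosen for the example, not objects defined in the paper.

```python
from fractions import Fraction as F

def classify(P, z):
    """Classify z as 'martingale', 'upper', 'lower' or 'neither'
    by comparing Pz with z componentwise."""
    Pz = [sum(p * x for p, x in zip(row, z)) for row in P]
    ge = all(a >= b for a, b in zip(Pz, z))   # Pz >= z
    le = all(a <= b for a, b in zip(Pz, z))   # Pz <= z
    if ge and le:
        return "martingale"
    if ge:
        return "upper"
    if le:
        return "lower"
    return "neither"

h = F(1, 2)
P = [[1, 0, 0, 0],   # state 0 is absorbing
     [h, 0, h, 0],
     [0, h, 0, h],
     [0, 0, 0, 1]]   # state 3 is absorbing

kinds = [classify(P, [0, F(1, 3), F(2, 3), 1]),
         classify(P, [0, 0, F(1, 2), 1]),
         classify(P, [0, F(1, 2), 1, 1])]
```

On the absorbing rows the comparison is always an equality, so only the interior components decide the classification.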

We assume that a set of nonnegative boundary values is assigned to the
elements of $B$, $v_j$ being assigned to state $j$. We denote by $U$ the set of all
nonnegative upper semimartingales and by $U^*$ the set of all nonnegative lower
semimartingales having the right boundary values. Thus $U$ is the set of all
vectors such that

(a) $Pz \ge z$,   (b) $z \ge 0$,   (c) $\{z\}_j = v_j$ for $j \in B$.

The set $U^*$ consists of the vectors satisfying conditions (b) and (c), and
condition (a) with the inequality sign reversed.

Throughout the paper $\{z\}_j$ will denote the $j$th component of the vector $z$.
Inequality signs between vectors will assert that the inequality holds
componentwise.

A representation theorem will be developed for all nonnegative semimartingales
with the prescribed boundary values in terms of martingales of modified
chains. A modified chain is one obtained by adding interior states to the
boundary, and assigning value 0 to them. The representation is unique and leads to
a simple geometric interpretation. $U$ will be represented (except in certain
degenerate cases) by a convex cubic $s$ dimensional polyhedron. In degenerate cases
the polyhedron reduces to smaller $s$ dimensional polyhedra, including an $s$-simplex
in the most degenerate case. $U^*$ will be obtained from a reflection of $U$
through the unique martingale.

These results will be applied to a treatment of certain sequential games, and 
to discrete subharmonic functions. In the latter application we will see that 
discrete subharmonic functions can be expressed as convex combinations of 
certain harmonic functions. And it is well known that the discrete harmonic 
function for given boundary values may be interpreted as the expected final 
value of a random walk. Hence we have a method of obtaining all discrete sub- 
harmonic functions in terms of certain random walks. 


2. The basic semimartingales. Let $T$ be a subset of $I$ and denote by $P(T)$ the
transition matrix obtained from $P$ by changing the states in $T$ into absorbing
states. Let $Q(T) = \lim_{n\to\infty} [P(T)]^n$. Then the $ij$th entry of $Q(T)$, $q_{ij}(T)$,
represents the probability that, starting at state $i$, the process will reach state $j$
before reaching any element of $T$. Let $q_j(T)$ denote the $j$th column of $Q(T)$.
Then since

\[ Q(T) = P(T)\cdot Q(T), \]

\[ q_{ij}(T) = \begin{cases} \sum_k p_{ik}\, q_{kj}(T), & i \in I - T, \\ 0, & i \in T, \end{cases} \]

we see that

\[ P q_j(T) \ge q_j(T), \qquad j \in B. \]

Thus $q_j(T)$ is an upper semimartingale. It has the boundary value of 0 on all
states of $B$ except $j$, and has the value of 1 on this state. Thus the vector $z(T)$
given by

\[ z(T) = \sum_{j=1}^{r} v_j\, q_j(T) \]

is a nonnegative upper semimartingale with the prescribed boundary values;
i.e., for each $T$, $z(T)$ is an element of $U$. We shall refer to $z(T)$ as a basic upper
semimartingale.
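
A basic upper semimartingale can be computed without forming the limit $Q(T)$ explicitly: on the states that are neither in $B$ nor in $T$ the vector $z(T)$ satisfies $z = Pz$, with the values $v_j$ fixed on $B$ and 0 on $T$. The sketch below solves that linear system exactly. The helper names and the example chain (a symmetric random walk on 0, ..., 4 with boundary values $v(0) = 0$, $v(4) = 1$) are assumptions made for illustration.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A nonsingular)."""
    n = len(A)
    M = [[F(x) for x in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def basic_semimartingale(P, boundary_values, T):
    """z(T): expected payoff when play stops on B (payoff v_j) or on T (payoff 0)."""
    n = len(P)
    stopped = set(boundary_values) | set(T)
    free = [i for i in range(n) if i not in stopped]
    payoff = {i: F(boundary_values.get(i, 0)) for i in stopped}
    # On free states: z_i - sum_{k free} p_ik z_k = sum_{k stopped} p_ik * payoff_k
    A = [[F(int(i == k)) - F(P[i][k]) for k in free] for i in free]
    b = [sum(F(P[i][k]) * payoff[k] for k in stopped) for i in free]
    z = [F(0)] * n
    for i in stopped:
        z[i] = payoff[i]
    for i, x in zip(free, solve(A, b) if free else []):
        z[i] = x
    return z

h = F(1, 2)
P = [[1, 0, 0, 0, 0],
     [h, 0, h, 0, 0],
     [0, h, 0, h, 0],
     [0, 0, h, 0, h],
     [0, 0, 0, 0, 1]]
v = {0: 0, 4: 1}                        # boundary values on B = {0, 4}
z_empty = basic_semimartingale(P, v, set())
z_T1 = basic_semimartingale(P, v, {1})
```

Enlarging $T$ can only lower the expected payoff, which is visible in the two vectors computed above.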

The vector $z(T)$ may be interpreted in a game played as follows: The process
starts in a given state, and continues until it reaches a state in $T$, or a state in
$B$, and is then stopped. If it stops at a state $j$ in $B$ the player receives $v_j$; if it
stops at a state in $T$, he receives 0. Then $\{z(T)\}_i$ represents the expected value
of the game to the player starting at state $i$. We shall appeal to this interpretation
for certain simple results, rather than give detailed proofs. For example:

LEMMA 1. Assume that $T_1$ and $T_2$ are subsets of $I$ such that $T_1 \subseteq T_2$. Then
$z(T_1) \ge z(T_2)$.

From this interpretation we can easily determine $z(\phi)$ and $z(I)$. If $T = \phi$,
then our game is always played till the boundary is reached, and hence $z(\phi)$
is the unique martingale with the prescribed boundary values. If $T = I$, then
we can never reach the boundary from $I$. Hence

\[ \{z(I)\}_i = \begin{cases} v_i, & i \in B, \\ 0, & i \in I. \end{cases} \]

It can be seen from Lemma 1 that $z(\phi)$ and $z(I)$ are the largest and smallest
$z(T)$, respectively. Since we will see later that all elements of $U$ are convex
combinations of $z(T)$'s, we see that $z(\phi)$ is the maximal and $z(I)$ the minimal
element of $U$.


3. A special case. We shall first solve the problem of describing $U$ for the case
where the following hypothesis is satisfied.

HYPOTHESIS A: The boundary values $v_j$ are all positive, and for any state $i$ in $I$
there is at least one $j$ in $B$ such that $p_{ij} > 0$.

In the case that hypothesis A is satisfied the game interpretation for $z(T)$
makes it clear that the following lemma holds.

LEMMA 2. Under hypothesis A, the $z(T)$ have the property that

\[ \{Pz(T)\}_i = \{z(T)\}_i > 0 \quad \text{for } i \in I - T \]

and

\[ \{Pz(T)\}_i > \{z(T)\}_i = 0 \quad \text{for } i \in T. \]

Thus for each component of $z(T)$ exactly one of the equalities in the defining
conditions (a) and (b) of $U$ holds.

LEMMA 3. Let $x_1, x_2, \ldots, x_n$ be distinct nonnegative vectors. Let $W_i$ be the
set of components of $x_i$ which are 0. Assume that if $W_i \subseteq W_j$ then $x_i \ge x_j$. If so
the vectors are convexly independent.

PROOF. Assume that $x_i = \sum_k a_k x_k$ with $a_k > 0$ and $k \ne i$ and $\sum_k a_k = 1$.
Then a component of $x_i$ can be 0 only if all the $x_k$ have this component 0. Hence
$W_i \subseteq W_k$, and $x_i \ge x_k$ for all $k$. But this can only be true if $x_i = x_k$ for all $k$,
contrary to hypothesis.

DEFINITION. A convex $n$-dimensional polyhedron is cubic if in every $j$ dimensional
face for each $j - 1$ dimensional subface there is a unique nonintersecting
$j - 1$ dimensional subface ($j = 1, 2, \ldots, n$).

THEOREM 1. If hypothesis A is satisfied, then $U$ is a convex cubic polyhedron
with $2^s$ corner points. These corner points are the $z(T)$ for $T \subseteq I$.

PROOF. We observe first that the $2^s$ $z(T)$ are distinct and convexly independent.
This follows from Lemmas 1 and 3. We shall now prove that the convex set
spanned by the $z(T)$ is a cubic polyhedron.

A $j$ dimensional face of the convex set spanned by the $z(T)$ is determined by
picking any $s - j$ interior states and requiring that one of the equalities

(a) $\{Pz(T)\}_i = \{z(T)\}_i$,
(b) $\{z(T)\}_i = 0$





146 JOHN G. KEMENY AND J. LAURIE SNELL 


hold for each $i$ in the set chosen. To obtain a $j - 1$ dimensional subface of this
face we impose an equality on one more component, say $k$. It follows from
hypothesis A that $Pz > 0$. Hence it is not possible to have equality (a) and
(b) for the same state. Hence this $j - 1$ dimensional face cannot intersect the
face obtained by choosing the other equality for the $k$th component. By Lemma
2 we can find a $z(T)$ which has any prescribed set of equalities one for each of
the pairs (a) and (b). Thus any $j - 1$ dimensional face obtained by choosing
an equality for a component $i \ne k$ must intersect that obtained by choosing an
equality for component $k$. Thus the set spanned by $z(T)$ satisfies the conditions
for a cubic polyhedron.

To complete the proof of the theorem we must show that if $z$ is in $U$ then
it must be in the cubic polyhedron spanned by the basic upper semimartingales
$z(T)$. But if $z$ is in $U$ it must satisfy

(a) $Pz \ge z$,
(b) $z \ge 0$.

Thus for each interior state $i$ it must lie between the hyperplane obtained by
requiring $\{Pz\}_i = \{z\}_i$ and the hyperplane obtained by requiring $\{z\}_i = 0$.
But this means that $z$ must lie between each pair of opposite faces in the cubic
polyhedron spanned by $z(T)$. Hence it must lie inside of this polyhedron.

DEFINITION. A sequence $T_0 \subset T_1 \subset \cdots \subset T_k$ of subsets of $I$ is called a chain.
The corresponding sequence of corner points $z(T_0), z(T_1), \ldots, z(T_k)$ is called
a $z$-chain. If $k = s$, the chain is called maximal.

It is clear that the elements of a $z$-chain are linearly independent and hence
span a simplex. A simplex spanned by a $z$-chain will be called a $z$-simplex.

LEMMA 4. Every face (of every dimension) of the cube $U$ has a maximal element.

PROOF. In the $s$-dimensional cube $U$, every $j$-face ($j = 0, 1, \ldots, s$) is a $j$
dimensional cube. This is clear from the definition of the cubic polyhedron. The
face of the cube is specified by imposing equalities of type (a) or (b) on $s - j$
components.

Since we have a polyhedral set, it suffices to show that there is a maximal
corner. The corners are specified by imposing equalities of one of the two types
on each of the $j$ remaining components. It is a direct consequence of Lemma 1
that a corner $z(T)$ is maximal if its $T$ is minimal. Hence we get a maximal corner
by imposing equalities (a) on all $j$ of the remaining components.

LEMMA 5. The intersection of two $z$-simplexes (if not empty) is a $z$-simplex which
is a common face of the two original simplexes.

PROOF. Let $T_0 \subset T_1 \subset \cdots \subset T_k$ and $T'_0 \subset T'_1 \subset \cdots \subset T'_m$. Let the two
simplexes be determined by the corresponding $z$'s. If there is a nonempty set of
$T$'s that the two chains have in common, then they span a common face. We
will show that this is the intersection of the two simplexes.

It will suffice to show that all the remaining corners of the second simplex
(if any) lie outside the first simplex. Let $T'$ be one of the sets in the second chain
that is not in the first chain. If $z(T')$ lies in the first simplex, then it is a convex
combination of its corners. But this is impossible, since the $z(T)$'s are convexly
independent. This completes the proof.

LEMMA 6. Every point of $U$ lies in at least one $z$-simplex.

PROOF. Let $z_0$ be a point of $U$. Starting with $\phi$ we will construct a chain so
that $z_0$ will lie in the simplex spanned by the corresponding $z$-chain.

First of all, draw a line from $z(\phi)$ through $z_0$ and continue it till it hits a face
of $U$ (of dimension less than $s$). Say it meets this face in the point $z_1$. Then $z_0$
is in the set spanned by $z(\phi)$ and $z_1$. In this face we pick the maximal point
$z(T_1)$, which exists by Lemma 4, and draw a line from it through $z_1$ till we hit a
face of lower dimension at a point $z_2$. Since $z_1$ lies in the set spanned by $z(T_1)$
and $z_2$, we know that $z_0$ is in the set spanned by $z(\phi)$ and $z(T_1)$ and $z_2$. We
iterate this procedure until some $z_k$ turns out to be a corner $z(T_k)$. This must
happen, since the dimension of the face decreases at each step. Then we will
have $z_0$ in the set spanned by $z(\phi), z(T_1), \ldots, z(T_k)$.

At each step we first introduced the minimal $T$ in the face, hence the $T$'s
are monotone decreasing and hence form a chain. Thus the corners we found
form a $z$-chain and the set they span is a $z$-simplex, which contains $z_0$.

THEOREM 2. Any $z$ in $U$ can be written uniquely as

\[ z = \sum_{j=0}^{k} a_j\, z(T_j), \]

with $a_j > 0$ and $\sum a_j = 1$, where the $z(T)$'s used form a $z$-chain.

PROOF. Let $z_0$ be any point in $U$. By Lemma 6 it lies in at least one $z$-simplex.
Form the intersection of all $z$-simplexes that contain $z_0$. This intersection is not
empty and hence by Lemma 5 it is a common face of all the $z$-simplexes. This
smallest possible $z$-simplex serves the purpose of our representation. Its corners
form a $z$-chain, and we can write $z_0$ as a convex combination of these. The weights
$a_j$ must all be positive, or else the point $z_0$ would lie in a smaller $z$-simplex.

To show the uniqueness of our representation we need only recall that the
representation of a point in a simplex in our (barycentric) representation is
unique. To get a representation of our form, the $z$-chain used must span a simplex
containing $z_0$. Hence the minimal simplex is a face of it. Hence the $a_j$'s can be
all positive only if the simplex is the minimal one we found. This establishes the
unique representation.

It is worth remarking that the theorem established only the uniqueness of the
smallest $z$-simplex containing $z_0$. If this simplex is of a dimension smaller than
$s$, then it is a common face of several $z$-simplexes. If hypothesis A is satisfied,
then there are $s!$ maximal $z$-chains, and correspondingly $s!$ maximal $z$-simplexes.
The cube is divided into these, and they overlap only in that they have common
faces of lower dimension. If a point is in the interior of one of the maximal
simplexes, then it is expressed by putting positive weights on all $s + 1$ corners.
If it is on a face, we must apply the same consideration to the smaller simplexes
in which it lies.


4. The general case. If we drop hypothesis A, most of the previous considerations
still apply. However, the argument as to the distinctness of the $z(T)$'s
breaks down. But by a continuity argument we can see that any case where the
hypothesis is not fulfilled is a limiting case of ones where the hypothesis holds,
and hence the worst that can happen is that some of the $z(T)$'s coincide, and
hence we have fewer corners on $U$. While it is still a polyhedron of dimension $s$,
it need not be cubic, and there will be fewer distinct $z$-simplexes. We will show how
the distinct $z$-simplexes can be found in the general case.

Let $B^*$ be the set of boundary points which have nonzero values assigned.
We assume that $B^*$ is not empty.

DEFINITION. A set $T$ is fundamental if from any point in $I - T$ it is possible
to reach $B^*$ without going through $T$.

Let $T$ be any set which is not fundamental. Add to $T$ all states which are cut
off from the set $B^*$ by $T$. The new set $T'$ is fundamental and $z(T)$ and $z(T')$ are
the same. On the other hand the $z(T)$'s whose $T$ is fundamental have 0 components
exactly on $T$, and hence are distinct. Thus the extreme points of $U$ are
given by the $z(T)$'s with $T$ fundamental.
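
The passage from a set $T$ to the fundamental set $T'$ is a reachability computation. The sketch below does it with a backward breadth-first search from $B^*$ that never enters $T$; every interior state not reached is cut off and is added to $T$. The function name and the example chain (a symmetric random walk on 0, ..., 4 with $B^* = \{4\}$) are assumptions chosen for illustration.

```python
from collections import deque

def fundamental_closure(P, interior, b_star, T):
    """Return T together with all interior states cut off from B* by T."""
    T = set(T)
    reach = set(b_star)              # states known to reach B* while avoiding T
    queue = deque(b_star)
    while queue:
        k = queue.popleft()
        for i in interior:
            # i can step directly to k, and i itself is not blocked by T
            if i not in T and i not in reach and P[i][k] > 0:
                reach.add(i)
                queue.append(i)
    return T | {i for i in interior if i not in reach}

P = [[1, 0, 0, 0, 0],
     [0.5, 0, 0.5, 0, 0],
     [0, 0.5, 0, 0.5, 0],
     [0, 0, 0.5, 0, 0.5],
     [0, 0, 0, 0, 1]]
interior = [1, 2, 3]
b_star = [4]                          # only state 4 carries a nonzero value

closure_of_2 = fundamental_closure(P, interior, b_star, {2})
```

A set $T$ is fundamental exactly when this function returns $T$ unchanged.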

LEMMA 7. There exists at least one $z$-simplex of dimension $s$.

PROOF. Let the index of an interior state be the minimum number of steps
required to reach a state of $B^*$ from it. Reorder the states in such a way that
their indices are nonincreasing. Then from any state it must be possible to
reach the boundary without going through a state appearing earlier in the
sequence. Let $T_j$ be the set of the first $j$ states. Then $T_0, T_1, T_2, \ldots, T_s$ is a
complete chain with all the $T_j$'s fundamental. Hence $z(T_0), z(T_1), \ldots, z(T_s)$
form the corner points of a $z$-simplex of dimension $s$.

Lemma 7 is all that is needed to insure the construction used in Theorem 2.
Hence the representation theorem applies equally well to the general case.
The lemma also establishes that even in the degenerate cases $U$ has dimension $s$.


5. The set of lower semimartingales U*. The set $U^*$ of all lower semimartingales
having prescribed boundary conditions is determined by replacing the condition

(a) $Pz \ge z$

by the condition

(a') $Pz \le z$.

It is easy to determine the set $U^*$ from what we know about $U$. Each face (of
dimension $s - 1$) of $U$ lies in a hyperplane determined by an equality $\{Pz\}_i = \{z\}_i$
or $\{z\}_i = 0$. The latter type faces lie in the coordinate planes. The former
normally protrude, and they have the martingale $z(\phi)$ as maximal corner. $U^*$ is
obtained by taking the set that lies on the other side of the hyperplanes
$\{Pz\}_i = \{z\}_i$. This is an $s$ dimensional cone with $z(\phi)$ as minimal element. Thus
$U^*$ is the reflection of $U$ through $z(\phi)$, and its linear extension. We also see that
the martingale is the maximal upper semimartingale and the minimal lower
semimartingale, the only point $U$ and $U^*$ have in common. It is possible to
represent these lower semimartingales in terms of the $z(T)$'s. In fact let $z^*$ be
any point in $U^*$. Then a line from $z^*$ through $z(\phi)$ will intersect a coordinate plane
in a point $z_1$ in $U$. Then $z^*$ may be uniquely written in the form

\[ z^* = z(\phi) + A\,(z(\phi) - z_1), \]

where $A$ is a nonnegative constant. On the other hand, by Theorem 2
$z_1 = \sum_j a_j z(T_j)$, $a_j > 0$, and $\sum a_j = 1$. Thus

\[ z^* = z(\phi) + A\,\Big(z(\phi) - \sum_j a_j\, z(T_j)\Big), \]

and $A$ and the $a_j$'s are unique.


We can summarize this by saying that we have a unique representation for
lower semimartingales:

\[ z^* = \sum_{j=0}^{k} a_j\, z(T_j), \]

where $a_j < 0$ for $j \ne 0$, and $\sum a_j = 1$, with $\phi = T_0 \subset T_1 \subset \cdots \subset T_k$ forming
a chain.


6. Arbitrary boundary values. We have assumed that specific boundary values
were given. The particular convex polyhedron obtained for $U$ depends on these
boundary values. However, the extreme points $z(T)$ are easily obtained from
$Q(T)$ for any choice of boundary values. In fact $z(I)$ is the vector with $r$
components given by the boundary values and 0 for all other components. The
vectors $z(T)$ are given by $z(T) = Q(T)z(I)$. The matrix $Q(T)$ does not depend
upon the boundary values; thus when we find these $Q(T)$'s we have essentially
solved the problem for all possible conditions.


7. Two examples. We shall give here two examples, one where hypothesis A 
is satisfied and one where it is not. For the first case let P be 


1 0 0 


ee 
p-(2 
1 


The states $B = \{1, 2\}$ are the boundary states and $I = \{3, 4\}$ are the interior
states. Assume that $v_1 = 2$ and $v_2 = 1$. The corner points are given by


1 
/|3 2{3}) = ), 2({4}) = ') 2({3,4}) = 
\$ 0 


Martingale = z(¢) = 


—— 
weds bo 


The set of all upper semimartingales consists of the set of all vectors

\[ (2,\ 1,\ x,\ y)^T \]

where $(x, y)$ is a point in the quadrilateral in Fig. 1. There are two maximal
chains $\phi, \{3\}, \{3, 4\}$ and $\phi, \{4\}, \{3, 4\}$.

[Fig. 1. The quadrilateral of points $(x, y)$; the labeled corners include $(7/5, 9/5)$ and $(0, 4/3)$.]

The regions above and below the dotted line, indicated by I and II, respectively,
are the corresponding simplexes. The lower semimartingales are given by
region III.

As an example of a case where we do not get a cubic polyhedron we consider
the problem of random walk on the line with states 0 and $s + 1$ absorbing.
Then the interior states are $I = \{1, 2, \ldots, s\}$. We require that $v(0) = 0$ and
$v(s + 1) = 1$. It is clear that many subsets of $I$ are not fundamental. In fact the
only fundamental sets are the sets $\phi$ and $T_j = \{1, 2, \ldots, j\}$ for $1 \le j \le s$.
Thus $U$ is the $s$-dimensional simplex with corners $z(\phi), z(T_1), z(T_2), \ldots, z(T_s)$.
These corner points are easily found from the ruin probabilities. They have
coordinates for the interior states given by

\[ \{z(T_j)\}_i = \begin{cases} 0, & i \le j, \\ \dfrac{i - j}{s + 1 - j}, & i \ge j. \end{cases} \]

Thus any upper semimartingale vector

\[ z = (a_0,\ a_1,\ a_2,\ \ldots,\ a_s,\ a_{s+1})^T \]

with $a_0 = 0$, $a_{s+1} = 1$, is given by $z = \sum_{j=0}^{s} t_j z(T_j)$, $\sum t_j = 1$, where $T_0 = \phi$.
In this case it is easy to reverse the process and to find the $t$'s from the $z$'s.
In fact for given $z$

\[ t_0 = (s + 1)\,a_1, \]

\[ t_j = (s + 1 - j)(a_{j+1} - 2a_j + a_{j-1}), \qquad j = 1, 2, \ldots, s. \]
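
The inversion can be checked numerically. The sketch below reads the formulas of this example as $t_0 = (s+1)a_1$ and $t_j = (s+1-j)(a_{j+1} - 2a_j + a_{j-1})$ for $1 \le j \le s$, with corners $\{z(T_j)\}_i = \max(i-j, 0)/(s+1-j)$ for the symmetric walk; these readings, the function names and the sample vector are assumptions made for illustration.

```python
from fractions import Fraction as F

def corner(s, j):
    """z(T_j) on the states 0..s+1, with T_0 the empty set."""
    return [F(max(i - j, 0), s + 1 - j) for i in range(s + 2)]

def weights(a):
    """Weights t_0..t_s recovered from z = (a_0, ..., a_{s+1})."""
    s = len(a) - 2
    t = [(s + 1) * a[1]]
    t += [(s + 1 - j) * (a[j + 1] - 2 * a[j] + a[j - 1]) for j in range(1, s + 1)]
    return t

# An upper semimartingale for s = 3: boundary values 0 and 1, and each
# interior value is at most the average of its two neighbours.
a = [F(0), F(1, 8), F(1, 4), F(1, 2), F(1)]
t = weights(a)

# Rebuild z from the weights and the corners as a consistency check.
s = len(a) - 2
rebuilt = [sum(t[j] * corner(s, j)[i] for j in range(s + 1)) for i in range(s + 2)]
```

Here one weight vanishes, so the corresponding corner drops out and the remaining corners with positive weight form the z-chain of Theorem 2.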
8. Application to sequential games. Consider an absorbing chain with $r$
absorbing states and $s$ interior states. Assume that we are given a vector

\[ v = (v_1,\ v_2,\ \ldots,\ v_{r+s})^T \]

which determines the following game: The player starts in one of the states of
the chain. If he is at an interior state $i$, he may either quit and collect $v_i$, or he
may move on with the given transition probabilities. If he reaches a boundary
state $j$, he collects $v_j$ and the game ends. Let $z_i$ be the value of the game to the
player if he starts in state $i$. We wish to find the vector

\[ z = (z_1,\ \ldots,\ z_{r+s})^T. \]

This is a special case of a problem considered in [2]. However, we can give a
more precise description of the solution in the case considered here. It is clear
that

\[ z = \max\,[v, Pz], \tag{1} \]

since the player may by quitting or continuing have either of these. We shall
now find a $z$ having this property and then show that it is unique. Define

\[ \bar z = \inf\,\{ z : z \ge v,\ z \ge Pz \}. \tag{2} \]

That is, $\bar z$ is the smallest lower semimartingale greater than $v$. If $\bar z$ did not have
the property (1), then we could obtain a smaller semimartingale greater than $v$
by replacing $\{\bar z\}_i$ by $\max(v_i, \{P\bar z\}_i)$ in any component $i$ for which
$\{\bar z\}_i > \max(v_i, \{P\bar z\}_i)$. Hence $\bar z$ must have the property (1).

Assume now that, for some other $z$, (1) is true. Let $T$ be the set of interior
states for which $\{z\}_i = v_i$; then

\[ P(T)z = z \]

and thus

\[ Q(T)z = \lim_{n\to\infty} [P(T)]^n z = z. \]

On the other hand,

\[ P(T)\bar z \le \bar z \]

so that

\[ Q(T)\bar z \le \bar z. \]

From the interpretation of $Q(T)$ (see Sec. 2), we know that $Q(T)z$ depends
only on the components of $z$ in $B \cup T$. And this is the set where $z = v$. Hence,

\[ Q(T)z = Q(T)v. \]

Thus

\[ z = Q(T)z = Q(T)v \le Q(T)\bar z \le \bar z. \]

But since $z \ge v$ and $z \ge Pz$ we see from (2) that $z \ge \bar z$. Therefore, $z = \bar z$. Hence
$\bar z$ is the unique vector satisfying (1), and its components are the value of the game
for various starting positions.
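
Because the solution of (1) is unique and is the smallest lower semimartingale above $v$, it can be approximated by starting from $v$ and applying the map $z \mapsto \max(v, Pz)$ repeatedly; the iterates increase and stay below the smallest lower semimartingale. The sketch below does this in floating point. The function name, the tolerance and the example (a symmetric random walk on 0, ..., 4) are assumptions for illustration, not the paper's method of proof.

```python
def game_value(P, v, tol=1e-12, max_iter=100_000):
    """Approximate the solution of z = max(v, Pz) by monotone iteration from v."""
    z = list(v)
    for _ in range(max_iter):
        Pz = [sum(p * x for p, x in zip(row, z)) for row in P]
        new = [max(vi, y) for vi, y in zip(v, Pz)]
        if max(abs(a - b) for a, b in zip(new, z)) < tol:
            return new
        z = new
    return z

P = [[1, 0, 0, 0, 0],
     [0.5, 0, 0.5, 0, 0],
     [0, 0.5, 0, 0.5, 0],
     [0, 0, 0.5, 0, 0.5],
     [0, 0, 0, 0, 1]]
v = [0.0, 1.0, 0.0, 0.0, 4.0]          # payoff for stopping in each state

z = game_value(P, v)
# States where continuing is strictly better than stopping.
continue_on = [i for i in range(len(v)) if z[i] - v[i] > 1e-9]
```

For this walk the value is the smallest concave function lying above $v$, here approximately $(0, 1, 2, 3, 4)$, and the player continues only on states 2 and 3.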

The optimal strategy is to continue on any component where $v_i < \{\bar z\}_i$.

A similar analysis shows that if the player wishes to minimize his fortune he
should find the largest upper semimartingale $z$ less than or equal to $v$ and play
only on states $i$ such that $\{z\}_i < v_i$. This latter problem has application to
statistical decision theory (see [2]).


For the first example given in Sec. 7, let the payoff vector be

\[ v = (2,\ 1,\ v_3,\ v_4)^T. \]

Consider the case of the maximizing player. The various possibilities are
indicated in Fig. 2. If $(v_3, v_4)$ is in the interior of region IV then $z = z(\phi)$, $v_3 < z_3$,
and $v_4 < z_4$. Hence the player should play on each interior state. If $(v_3, v_4)$ is in the
interior of region II or its dotted boundary, then the smallest lower semimartingale
greater than $v$ is the point on the lower boundary of region I vertically above
$v$. Thus $z_3 = v_3$, $z_4 > v_4$. The player should stop on 3 and play on 4 in this
region. Similarly, in region III he should stop on 4 and play on 3. If $(v_3, v_4)$ is in
region I, then $z = v$ and he should not play on any state.

[Fig. 2. Regions I to IV of the $(v_3, v_4)$ plane; the point $z(\phi) = (7/5, 9/5)$ is marked.]


9. Games with a fee for each play. The results of Sec. 8 can be extended to
a game in which the player must pay a fee $c_i$ if he wishes to continue playing in
interior state $i$. Alternatively, we may think of $c_i$ as the cost of carrying out an
additional experiment. Let $c$ be the column vector which is 0 in $B$ and has
components $c_i$ in $I$. Then by an immediate extension of the previous argument, the
vector $z$ giving the values of the various states satisfies

\[ z = \max(v, Pz - c). \tag{3} \]

Let $d$ be the column vector such that $d_i$ is the expected cost to reach the boundary
from state $i$. It can be shown that if we take the matrix $I - P$, where $I$ is the
identity matrix, truncate it to the $s \times s$ matrix obtained by eliminating the
boundary states, and take its inverse, then the $ij$th entry of the resulting matrix
gives the expected number of times the process will be in state $j$ if it starts in
state $i$. (See [3], Chapter VII, Sec. 4.) This matrix multiplied into the truncated
$c$-vector gives the truncated $d$-vector. Remembering that both vectors are 0 in
$B$, we see that

\[ (I - P)d = c. \]

Hence

\[ Pd = d - c. \tag{4} \]

Since $d$ is a fixed vector, we have from (3) that

\[ z + d = \max(v + d,\ Pz - c + d) \]

and from (4) we see that

\[ z + d = \max(v + d,\ P(z + d)). \]

But this is the problem we solved above. The vector $z + d$ is the least lower
semimartingale greater than $v + d$. Thus the value of the game is given by the
vector $z$ that is found: First we find the least lower semimartingale greater than
$v + d$, then we subtract $d$.

Thus the game with the cost vector c is strategically equivalent to a costless 
game in which the payoff vector v has added to it the expected cost of reaching 
the boundary. 
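
The equivalence can be illustrated numerically. The sketch below computes $d$ from $d = c + Pd$, solves the game with fees by iterating $z \mapsto \max(v, Pz - c)$ from $v$, and compares the result with the costless game for payoff $v + d$. All names, the iteration scheme, the tolerance and the example chain (a symmetric random walk on 0, ..., 4 with a fee of 0.1 per play) are assumptions made for illustration.

```python
def matvec(P, x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

def fixed_point(step, start, tol=1e-13, max_iter=100_000):
    """Iterate x <- step(x) until successive iterates agree to within tol."""
    x = list(start)
    for _ in range(max_iter):
        new = step(x)
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x

def expected_cost(P, c):
    """d with (I - P) d = c and d = 0 on the boundary, via d <- c + P d."""
    return fixed_point(lambda d: [ci + y for ci, y in zip(c, matvec(P, d))],
                       [0.0] * len(c))

def fee_game_value(P, v, c):
    """Iterate z <- max(v, Pz - c), starting from z = v."""
    return fixed_point(
        lambda z: [max(vi, y - ci) for vi, y, ci in zip(v, matvec(P, z), c)], v)

P = [[1, 0, 0, 0, 0],
     [0.5, 0, 0.5, 0, 0],
     [0, 0.5, 0, 0.5, 0],
     [0, 0, 0.5, 0, 0.5],
     [0, 0, 0, 0, 1]]
v = [0.0, 1.0, 0.0, 0.0, 4.0]
c = [0.0, 0.1, 0.1, 0.1, 0.0]          # fee on interior states only

d = expected_cost(P, c)
z_direct = fee_game_value(P, v, c)
w = fee_game_value(P, [vi + di for vi, di in zip(v, d)], [0.0] * len(v))
z_shifted = [wi - di for wi, di in zip(w, d)]
```

Both routes give approximately $(0, 1, 1.8, 2.8, 4)$ for this example, with $d \approx (0, 0.3, 0.4, 0.3, 0)$.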


10. Application to discrete subharmonic theory. Consider the lattice of points
in the plane of the form $(m, n)$ where $m$ and $n$ are integers. A random walk in
the plane is a process which moves from $(x, y)$ to $(x + 1, y)$, $(x - 1, y)$,
$(x, y + 1)$, $(x, y - 1)$ with equal probabilities.

Let $B$ and $I$ be finite sets of lattice points such that from any point of $I$ a
random walk can reach a point of $B$ but cannot reach any point not in $B \cup I$
without going through $B$. Then $B$ is called a boundary set and $I$ an interior set.
Consider a boundary set $B$ and interior set $I$. Assume that boundary values
$v(j, k)$ are given on $B$. Then there is a unique lattice function $f$ defined on $B \cup I$
having the property that

\[ f(j, k) = \tfrac14 f(j + 1, k) + \tfrac14 f(j, k + 1) + \tfrac14 f(j - 1, k) + \tfrac14 f(j, k - 1), \qquad (j, k) \in I, \]

and

\[ f(j, k) = v(j, k), \qquad (j, k) \in B. \]

This function provides the discrete analogue for the solution of the Dirichlet
problem; the function $f$ is a discrete harmonic function. One should ask the
corresponding problem for discrete subharmonic functions. That is, a function
$f$ is a discrete subharmonic function with prescribed boundary values if

\[ f(j, k) \le \tfrac14 f(j + 1, k) + \tfrac14 f(j, k + 1) + \tfrac14 f(j - 1, k) + \tfrac14 f(j, k - 1), \qquad (j, k) \in I, \]

\[ f(j, k) = v(j, k), \qquad (j, k) \in B. \]

In this case the solution would not be unique.

The random walk in $I \cup B$ forms an absorbing Markov chain. Assume that
the boundary values are nonnegative. The harmonic function $f$ is given by the
vector $z(\phi)$. The subharmonic functions are the semimartingale vectors. Thus
the set of all subharmonic functions forms a convex polyhedron and each such
function may be represented in terms of a finite number of basic semimartingales.
Each basic semimartingale $z(T)$ is simply the unique solution for the Dirichlet
problem for boundary $B \cup T$ with the given values on $B$ and 0 on $T$. Thus the
set of all subharmonic functions for a given boundary $B$ may be represented as
convex combinations of the harmonic functions for the boundary sets $B \cup T$.
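
The discrete Dirichlet problem, and with it each basic semimartingale $z(T)$, can be solved by relaxation: every interior value is repeatedly replaced by the average of its four neighbours. The sketch below uses a Gauss-Seidel sweep on a very small lattice. The function name, the tolerance and the example (two interior points with boundary values $v(j, k) = j$) are assumptions made for illustration.

```python
def dirichlet(interior, boundary_values, tol=1e-12, max_iter=100_000):
    """Solve f = average of the four neighbours on `interior`,
    with f fixed to the given values on the boundary points."""
    f = dict(boundary_values)
    f.update({pt: 0.0 for pt in interior})
    for _ in range(max_iter):
        change = 0.0
        for (j, k) in interior:
            new = 0.25 * (f[j + 1, k] + f[j - 1, k] + f[j, k + 1] + f[j, k - 1])
            change = max(change, abs(new - f[j, k]))
            f[j, k] = new
        if change < tol:
            break
    return f

interior = [(1, 1), (2, 1)]
# Boundary: every lattice neighbour of an interior point that is not interior.
B = [(0, 1), (3, 1), (1, 0), (1, 2), (2, 0), (2, 2)]
v = {(j, k): float(j) for (j, k) in B}

harmonic = dirichlet(interior, v)           # corresponds to T empty

# Basic semimartingale for T = {(1, 1)}: move (1, 1) to the boundary with value 0.
v_T = dict(v)
v_T[(1, 1)] = 0.0
basic = dirichlet([(2, 1)], v_T)
```

With these linear boundary values the harmonic solution is $f(j, k) = j$, and the basic semimartingale lies below it, as the representation requires.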

We have reason to believe that these results obtained for discrete subharmonic 
functions will lead to analogous results for ordinary subharmonic functions. 
NOTE ON SUFFICIENT STATISTICS AND TWO-STAGE
PROCEDURES


By S. G. GHURYE
University of Chicago

0. Introduction. This note is the result of an attempt to discover problems in 
which one can apply the two-stage procedure used by Stein [1] for tests regarding 
the mean of a normal population. One such problem, that of testing for a loca- 
tion parameter of an exponential population, was found to be easily soluble along 
the lines of Stein's work. An investigation of the problem of optimum statistics
for such procedures was also undertaken, and partial solutions, given in Sec. 2,
were found. In this connection, the author would like to thank the referee for
his useful comments.


1. Testing for a location parameter of a distribution. Throughout this paper
$F(x)$ will be a one-dimensional c.d.f. with at least two points of increase.
Further, $\{Y_i\}$ will always denote a sequence of independent random
variables having a common c.d.f. $F(x)$ and $\{X_i\}$ will denote a family of
sequences of independent random variables, all elements of any one sequence
having a common c.d.f. $F[(x - \theta)/\sigma]$, $-\infty < \theta < \infty$, $\sigma > 0$. We shall be dealing
with statistics or sequences of real and single-valued functions $t(n; x_1, \dots, x_n)$
and $s(n; x_1, \dots, x_n)$ of $n$ real variables, $n = 1, 2, \dots$, about which one or
more of the following assumptions will be made as required:

Assumption I. For any integer $n > 0$, any $a > 0$, any real $b$ and any
$(x_1, \dots, x_n) \in R^n$,

$$t(n; ax_1 + b, \dots, ax_n + b) = a\,t(n; x_1, \dots, x_n) + b. \tag{1}$$

Assumption II. Analogously,

$$s(n; ax_1 + b, \dots, ax_n + b) = a\,s(n; x_1, \dots, x_n). \tag{2}$$


Assumption III. There exists a positive, nondecreasing and unbounded
sequence $k(n)$ such that

$$\Pr\{t(n; Y_1, \dots, Y_n) \le x/k(n)\} = G(x) \tag{3}$$

is independent of $n$. Without loss of generality, we may assume $k(1) = 1$.

Assumption IV. The random variables $t(n; Y_1, \dots, Y_n)$ and $s(n; Y_1, \dots, Y_n)$
are stochastically independent.

Assumption V. There exists a positive integer $m$, such that for any $n > m$,
$t(n; x_1, \dots, x_n)$ is a function only of $m$, $n$, $t(m; x_1, \dots, x_m)$ and $x_{m+1}, \dots, x_n$.

Assumption VI. Let

$$\Pr\{s(n; Y_1, \dots, Y_n) \le x\} = H(x; n). \tag{4}$$

Then $H(0; n) = 0$ for all $n$.
Now, let $\Pi$ be a population whose c.d.f. is known to be $F[(x - \theta)/\sigma]$, but
$\theta$, $\sigma$ are unknown, and suppose that it is desired to obtain a test of $H_0: \theta = \theta_0$
against the alternative $\theta > \theta_0$ with the following properties:

(a) The size of the test is to be a prescribed probability $\alpha$ for all $\sigma > 0$;

(b) The power of the test for $\theta = \theta_0 + \delta$, where $\delta > 0$ is a given number,
must be not less than a prescribed probability $\beta > \alpha$ for all $\sigma > 0$;

(c) The power $\to 1$ as $\theta \to \infty$.

It is easy to see that, if Assumptions I-VI are satisfied, the following
procedure, which is essentially that given by Stein, has the required properties:

Choose an $m$ satisfying Assumption V, and let

$$\chi = \text{infimum of all } y \text{ such that } \int_0^\infty G(yu)\,dH(u; m) \ge 1 - \alpha; \tag{5}$$

$$\gamma = \begin{cases} \dfrac{\int_0^\infty G(\chi u)\,dH(u; m) - 1 + \alpha}{\int_0^\infty \{G(\chi u) - G(\chi u - 0)\}\,dH(u; m)} & \text{if the denominator} > 0, \\[2ex] 0 & \text{otherwise;} \end{cases} \tag{6}$$

$$P(y) = 1 - \int_0^\infty G(yu)\,dH(u; m) + \gamma \int_0^\infty \{G(yu) - G(yu - 0)\}\,dH(u; m); \tag{7}$$

$$\chi' = \text{supremum of all } y \text{ such that } P(y) \ge \beta; \tag{8}$$

$$\rho = (\chi - \chi')/\delta > 0. \tag{9}$$

Take $m$ independent observations $X_1, \dots, X_m$ from $\Pi$, and calculate

$$s_m = s(m; X_1, \dots, X_m).$$

Let $N$ be such that

$$k(N - 1) < \rho s_m \le k(N), \tag{10}$$

except if $\rho s_m < k(m)$, in which case $N = m$.
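
As a minimal sketch of rule (10), the total sample size can be found by stepping $N$ upward from $m$ until $k(N)$ first reaches $\rho s_m$. The function name and the numerical inputs below are hypothetical; the two sequences used, $k(n) = n$ and $k(n) = \sqrt{n}$, are illustrative choices (the first is the one given in (21) for the exponential case).

```python
from math import sqrt


def total_sample_size(rho_s_m, m, k):
    """Return the N of rule (10): the smallest N >= m with k(N) >= rho * s_m.

    rho_s_m : the product rho * s_m computed from the first-stage sample
    m       : first-stage sample size
    k       : the nondecreasing, unbounded sequence of Assumption III
    """
    n = m
    while k(n) < rho_s_m:
        n += 1
    return n


# k(n) = n: N is rho * s_m rounded up, but never less than m.
n_linear = total_sample_size(7.3, 3, lambda n: n)
n_first_stage_only = total_sample_size(2.0, 3, lambda n: n)

# k(n) = sqrt(n): sqrt(6) < 2.5 <= sqrt(7).
n_root = total_sample_size(2.5, 2, sqrt)
```

The loop terminates because $k(n)$ is unbounded, and it returns $m$ itself whenever $\rho s_m$ does not exceed $k(m)$, so no second-stage observations are taken in that case.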

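If $N > m$, take $N - m$ more independent observations $X_{m+1}, \dots, X_N$ from
$\Pi$, calculate

$$U = \{t(N; X_1, \dots, X_N) - \theta_0\}\,k(N)/s_m, \tag{11}$$

and reject $H_0$ with probability $\phi(U)$, where

$$\phi(u) = \begin{cases} 0, & u < \chi, \\ \gamma, & u = \chi, \\ 1, & u > \chi. \end{cases} \tag{12}$$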










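The expected sample-size is

$$E(N) = m \Pr\{\rho s_m < k(m)\} + \sum_{r=1}^{\infty} (m + r) \Pr\{k(m + r - 1) < \rho s_m \le k(m + r)\} = m + \sum_{r=m}^{\infty} \bar H\{k(r)\sigma^{-1}\rho^{-1}; m\}, \tag{13}$$

where $\bar H(x) = 1 - H(x)$.

For computational convenience, we have the inequalities

$$\nu < E(N) < \nu + \epsilon, \tag{14}$$

where

$$\nu = m H\{k(m)\sigma^{-1}\rho^{-1}; m\} + \int_{\sigma\rho u > k(m)} k^{-1}(\sigma\rho u)\,dH(u; m), \tag{15}$$

$$\epsilon = \bar H\{k(m)\sigma^{-1}\rho^{-1}; m\}, \tag{16}$$

and $k^{-1}(u)$ is any monotone function of $u > 0$ such that $k\{k^{-1}(n)\} = n$ for
every integer $n > 0$.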

It may be noted that, if in Assumption III we drop the restriction $k(1) = 1$,
then $ck(n)$, with $c > 0$, serves instead of $k(n)$ (with a different $G$ for each $c$).
However from (10) it is easy to see that $N$ is independent of this $c$. Thus the
restriction $k(1) = 1$ does not cause any loss of generality. In the same way, it
can be seen that if we substitute $cs(n; x_1, \dots, x_n)$ for $s(n; x_1, \dots, x_n)$, $c > 0$,
$N$ is unaffected.

Examples in which $t$ and $s$ satisfying the assumptions can be found are
provided by the normal distribution, which was discussed in detail by Stein, and
the exponential distribution which we shall take up here.

Of the several possible choices for $(t, s)$ in the normal case, Stein considered
two, in both of which $s^2$ is the usual estimate of $\sigma^2$. By using a special linear
function for $t$, he was able to obtain a test whose power is independent of $\sigma$
instead of merely being bounded below by a function independent of $\sigma$ as
required in property (b) above. However, he noted that this procedure "wastes
information," and advocated one using the sample mean as $t$. In fact, the use
of any statistic other than the sample mean is wasteful in the sense that it leads
to a higher expected sample-size, as we shall see in Sec. 2.

Next, let $\Pi$ be a population with c.d.f. $F[(x - \theta)/\sigma]$, where

$$F(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 - e^{-x} & \text{if } x \ge 0. \end{cases} \tag{17}$$

Let $x_{(1)}, \dots, x_{(n)}$ denote the rearrangement of numbers $x_1, \dots, x_n$ in
ascending order of magnitudes, and let

$$t_0(n; x_1, \dots, x_n) = x_{(1)} = \min(x_1, \dots, x_n). \tag{18}$$
This is a sufficient estimator of $\theta$, and we shall see in the next section that it
is the best for use as "$t$". If corresponding to independent observations $X_1$,
$\dots$, $X_n$ from $\Pi$, we put $Z_1 = X_{(1)}$, and

$$Z_i = X_{(i)} - X_{(i-1)}, \qquad i = 2, \dots, n,$$

the joint density of $Z_1, \dots, Z_n$ is

$$f(z_1, \dots, z_n) = \begin{cases} n!\,\sigma^{-n} \exp\{-[n(z_1 - \theta) + (n - 1)z_2 + \dots + z_n]/\sigma\} & \text{if } z_1 \ge \theta,\ z_i \ge 0,\ i = 2, \dots, n, \\ 0 & \text{otherwise.} \end{cases} \tag{19}$$


Any function satisfying Assumption II is a function only of the differences of
the arguments, and hence only of $Z_2, \dots, Z_n$. It is thus independent of $Z_1$ and
can be used as "$s$" if it is positive with probability 1. This is true more
generally, as shown by Lemma 2 of the next section. It will be seen (Example 2)
that, asymptotically as $\sigma \to \infty$, the best statistic to use as $s$ is

$$s_0(n; x_1, \dots, x_n) = \sum_{i=1}^{n} x_i - n\,t_0(n; x_1, \dots, x_n), \tag{20}$$

which together with $t_0$ is sufficient for $\sigma$.
For this pair of statistics, we have from (19)

$$k(n) = n, \qquad G(x) = F(x), \tag{21}$$

$$H(u; m) = \begin{cases} \int_0^u x^{m-2} e^{-x}\,dx/(m - 2)!, & u > 0, \\ 0, & u \le 0; \end{cases} \tag{22}$$

$$\nu = m \int_0^{mc} u^{m-2} e^{-u}\,du/(m - 2)! + \{(m - 1)/c\} \int_{mc}^{\infty} u^{m-1} e^{-u}\,du/(m - 1)!, \tag{23}$$

$$\epsilon = \int_{mc}^{\infty} u^{m-2} e^{-u}\,du/(m - 2)!, \tag{24}$$

where $c = (\sigma\rho)^{-1}$.
The values of $\nu$ and $\epsilon$ were calculated for $\alpha = 0.05 = 1 - \beta$ and $\alpha = 0.01 = 1 - \beta$
and several values of $m$ and $\delta/\sigma$. These are given in Table 1.
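
The bounds (23) and (24) are easy to evaluate directly. The sketch below is an illustration under stated assumptions rather than the author's computation: it uses the elementary closed form of the incomplete gamma integral for integer shape, and it obtains $\chi$ and $\chi'$ from $M(y) = 1 - (1 + y)^{-(m-1)}$, the function given for this case in Example 2 of Sec. 2, with $M(\chi) = 1 - \alpha$ and $M(\chi') = 1 - \beta$ (the randomisation term drops out because $G$ is continuous). All function names are hypothetical.

```python
from math import exp, factorial


def upper_gamma_tail(shape, x):
    """Integral from x to infinity of u^(shape-1) e^(-u) / (shape-1)! for integer shape >= 1."""
    return exp(-x) * sum(x ** j / factorial(j) for j in range(shape))


def bounds_exponential(m, alpha, beta, delta_over_sigma):
    """Return (nu, eps) of (23) and (24), so that nu < E(N) < nu + eps."""
    mu = m - 1
    chi = alpha ** (-1.0 / mu) - 1.0        # solves M(chi) = 1 - alpha
    chi_prime = beta ** (-1.0 / mu) - 1.0   # solves M(chi') = 1 - beta
    sigma_rho = (chi - chi_prime) / delta_over_sigma  # sigma * rho, using (9)
    c = 1.0 / sigma_rho
    eps = upper_gamma_tail(m - 1, m * c)
    nu = m * (1.0 - eps) + (m - 1) / c * upper_gamma_tail(m, m * c)
    return nu, eps


nu_a, eps_a = bounds_exponential(2, 0.05, 0.95, 0.05)
nu_b, eps_b = bounds_exponential(4, 0.05, 0.95, 1.00)
```

Under these assumptions the two calls give values close to the Table 1 entries for $\alpha = .05$ with $(m, \delta/\sigma) = (2, 0.05)$ and $(4, 1.00)$.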


2. Optimum choice of statistics. We shall now prove three preliminary lemmas
which enable us to show that if a suitable sufficient estimator of $\theta$ exists, it
minimizes the expected sample size among all $t$ satisfying the assumptions.

Lemma 1. Let Y be a real-valued, one-dimensional random variable, and f(y) a 
TABLE 1
Expected sample size

(The entries are given in the form ν + ε and imply that ν < E(N) < ν + ε)

                                          δ/σ
  α     m      0.05        0.10        0.20        0.40        0.60        0.80        1.00
 .05    2    379.0+1.0   189.4+1.0    94.8+1.0    47.4+1.0    31.6+0.9    23.8+0.9    19.1+0.9
        4    101.8+1.0    50.9+1.0    25.5+1.0    12.8+0.9     8.7+0.8     6.4+0.7     5.7+0.6
        6     81.0+1.0    40.5+1.0    20.3+1.0    10.4+0.8     7.6+0.5     6.5+0.3     6.2+0.1
        8     73.6+1.0    36.9+1.0    18.5+1.0    10.0+0.7     8.3+0.2     8.1+0.0     8.0+0.0
       10     70.0+1.0    35.0+1.0    17.6+0.9    10.7+0.3    10.0+0.0    10.0+0.0    10.0+0.0
       20     63.9+1.0    32.3+1.0    20.3+0.1    20.0+0.0    20.0+0.0    20.0+0.0    20.0+0.0
 .01    2   1980.0+1.0   990.0+1.0   495.0+1.0   247.5+1.0   165.5+1.0   123.8+1.0    99.0+1.0
        4    218.3+1.0   109.1+1.0    54.6+1.0    27.3+1.0    18.2+1.0    13.7+0.9    11.0+0.9
        6    151.0+1.0    75.5+1.0    37.8+1.0    18.9+0.9    12.6+0.9     9.8+0.8     8.2+0.7
        8    130.1+1.0    65.0+1.0    32.6+1.0    16.3+0.9    11.3+0.7     9.3+0.5     8.2+0.2
       10    120.6+1.0    60.3+1.0    30.0+1.0    15.3+0.8    11.3+0.5    10.3+0.2    10.0+0.0
       20    104.0+1.0    52.0+1.0    26.8+0.8    20.0+0.0    20.0+0.0    20.0+0.0    20.0+0.0


measurable, real-valued function with the property that for any real $x$, $\theta$ and any
$\sigma > 0$,

$$\Pr\{f(\sigma Y + \theta) - \theta \le \sigma x\} = \Pr\{f(Y) \le x\}. \tag{25}$$

Then if $f(y)$ is strictly monotone, there exists an interval $I$, open or closed, such that
$y \in I \Rightarrow f(y) = y$, and $\Pr\{Y \in I\} = 1$.

Proof. To start with, we note that since the right hand member of (25) is a
nondecreasing function of $x$, $f(y)$ cannot be a decreasing function; for if it were,
we would have

$$\Pr\{f(Y) \le x\} = \Pr\{f(Y + \theta) - \theta \le x\} \ge \Pr\{f(Y) - \theta \le x\} \qquad \text{for all } \theta > 0.$$

This implies that the c.d.f. of $Y$ is constant and hence contradicts the
assumption that $F(x)$ has at least two points of increase.
Therefore $f(y)$ is increasing, and
tuf(y) Su} = fyy Sf}. 


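Let $h(y) = f(y) - y$; then from (25) we obtain

$$\Pr\Big\{Y < x + \frac{h(\sigma x + \theta)}{\sigma}\Big\} = \Pr\{Y < x + h(x)\}, \tag{26}$$

$$\Pr\Big\{Y = x + \frac{h(\sigma x + \theta)}{\sigma}\Big\} = \Pr\{Y = x + h(x)\}. \tag{27}$$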

Suppose there exists a $y_0$ such that $h(y_0) \ne 0$. From (26), we get the relation
o 


( 
Pr{¥ < te + mw) = Pr (y < yot+ h(4o)}. 


By letting $\sigma \to 0$ and again $\sigma \to \infty$, we see that

$$\begin{cases} h(y_0) > 0 \text{ implies } \Pr\{Y \le y_0\} = 1, \\ h(y_0) < 0 \text{ implies } \Pr\{Y < y_0\} = 0. \end{cases} \tag{28}$$

Hence, if there exist $x_0$, $y_0$ such that $h(x_0) < 0$ and $h(y_0) > 0$, we must have
$x_0 < y_0$. Consequently, there exist points $x_0$, $y_0$ (which may be respectively
$-\infty$ and $\infty$) such that

$$h(x_0) \le 0, \quad h(y_0) \ge 0, \quad h(y) = 0 \text{ for } x_0 < y < y_0, \quad \text{and } \Pr\{x_0 \le Y \le y_0\} = 1. \tag{29}$$

Finally, if $h(y_0) > 0$ we see from (27), by choosing $\sigma = 1$ and $\theta$ such that
$x_0 - y_0 < \theta < 0$, that

$$\Pr\{Y = y_0\} = \Pr\{Y = y_0 + h(y_0)\} = 0$$

from (29), and similarly, $h(x_0) < 0$ implies $\Pr\{Y = x_0\} = 0$. Hence the result.
Next we want to consider two statistics, one of which is sufficient for $\theta$ and
both of which have $\theta$ as a location parameter. More specifically we prove

Lemma 2. Let $P(\cdot\,; \theta)$, $-\infty < \theta < \infty$, be a family of probability measures on a
countably additive class of subsets of a set $\Omega$ of points $\omega$; let $f(\omega)$, $g(\omega)$ be measurable
real valued functions on $\Omega$ such that for any Borel sets $S$, $T$ on the real line,

$$P\{f^{-1}(S + \theta) \cap g^{-1}(T + \theta); \theta\} = P\{f^{-1}(S) \cap g^{-1}(T); 0\}. \tag{30}$$

If $f(\omega)$ is a sufficient statistic for the family $P(\cdot\,; \theta)$, the random variables
$g(\omega) - f(\omega)$ and $f(\omega)$ are stochastically independent.
Proof. Writing $Pf^{-1}(S)$ to denote $P\{f^{-1}(S) \cap \Omega; 0\}$, we have from (30)

$$P\{f^{-1}(S) \cap \Omega; \theta\} = Pf^{-1}(S - \theta). \tag{31}$$

By the Radon-Nikodym Theorem and the sufficiency of $f(\omega)$, we know that
corresponding to each set $T$, there exists an integrable function $\lambda(T \mid x)$ on the
real line such that

$$P\{f^{-1}(S) \cap g^{-1}(T); \theta\} = \int_S \lambda(T \mid x)\,dPf^{-1}(x - \theta).$$

This gives us

$$P\{f^{-1}(S) \cap g^{-1}(T); 0\} = P\{f^{-1}(S + \theta) \cap g^{-1}(T + \theta); \theta\} = \int_S \lambda(T + \theta \mid x + \theta)\,dPf^{-1}(x).$$

It follows that for every set $T$, we have

$$P\{f^{-1}(S) \cap g^{-1}(T); 0\} = \int_S \lambda(T - x \mid 0)\,dPf^{-1}(x) = \int_S \mu(T - x)\,dPf^{-1}(x),$$

so that

$$\Pr\{g(\omega) \in T \mid f(\omega) = x\} = \mu(T - x) \quad \text{for a.e. } x\ [Pf^{-1}],$$

and consequently

$$\Pr\{g(\omega) - f(\omega) \in T \mid f(\omega) = x\} = \mu(T)$$

is independent of $x$.

Corollary 2.1. Let $t_j(n; x_1, \dots, x_n)$, $j = 0, 1$, be functions satisfying
Assumption I and let $t_0(n; x_1, \dots, x_n)$ be a sufficient statistic for the family of distributions
$\prod_{i=1}^{n} F(x_i - \theta)$, $-\infty < \theta < \infty$. Then for any $n$, if $X_1, \dots, X_n$ are independent
random variables having the common c.d.f. $F[(x - \theta)/\sigma]$, the random variables
$t_0(n; X_1, \dots, X_n)$ and $t_1(n; X_1, \dots, X_n) - t_0(n; X_1, \dots, X_n)$ are independent.

Corollary 2.2. If $t_0(n; x_1, \dots, x_n)$ is as in Corollary 2.1 and $s(n; x_1, \dots, x_n)$
is any function satisfying Assumption II, the random variables $t_0(n; X_1, \dots, X_n)$
and $s(n; X_1, \dots, X_n)$ are independent.
Lemma 3¹. Let $t_0$, $t_1$, $X_1, \dots, X_n$ be as in Corollary 2.1 and suppose that $t_0$,
$t_1$ also satisfy Assumption III with respective sequences $k_0(n)$, $k_1(n)$. Then

$$k_0(n) \ge k_1(n), \tag{32}$$

the equality holding if and only if

$$\Pr\{t_0(n; X_1, \dots, X_n) = t_1(n; X_1, \dots, X_n)\} = 1. \tag{33}$$

Further, for any $a$, $b$ such that $a < 0 < b$,

$$\Pr\{a < t_0(n; X_1, \dots, X_n) - \theta < b\} \ge \Pr\{a < t_1(n; X_1, \dots, X_n) - \theta < b\}. \tag{34}$$

¹ It may be of interest to compare this with the results of Pitman [3], pp. 401-402.
Proof. From Assumptions I and III, it follows that

$$\Pr\{t_i(n; X_1, \dots, X_n) \le x\} = F\Big\{\frac{x - \theta}{\sigma}\,k_i(n)\Big\}, \qquad i = 0, 1. \tag{35}$$

Let $f(z) = \int e^{izx}\,dF(x)$, $c_n = k_0(n)/k_1(n)$ and

$$g\{z/k_0(n)\} = E\,e^{iz[t_1(n; Y_1, \dots, Y_n) - t_0(n; Y_1, \dots, Y_n)]}.$$

Then from Corollary 2.1 and (35), we get

$$f(c_n z) = f(z)\,g(z). \tag{36}$$

First suppose $f(z_0) = 0$ for some $z_0$. Then $f(c_n^r z_0) = 0$ for every $r$, and hence
$c_n \ge 1$. On the other hand, if $f(z) \ne 0$ for all $z$, $g(z) = f(c_n z)/f(z)$ is a
characteristic function. Hence $f(c_n^r z)/f(z) = g(z)\,g(c_n z) \cdots g(c_n^{r-1} z)$, $r = 1, 2, \dots$,
is a sequence of characteristic functions. If $c_n < 1$, $\lim_{r \to \infty} f(c_n^r z)/f(z) = 1/f(z)$
for every finite $z$, and since the limit is continuous in $z$, it is a characteristic
function. But $f(z)$ is also a characteristic function. This implies $|f(z)| = 1$ for all $z$,
which is impossible if $F$ has more than one point of increase. Consequently,
$c_n \ge 1$, that is to say (32) holds.

It follows from (36) that $c_n = 1$ if and only if $g(z) = 1$, in which case (33)
follows on account of Assumption I.

Finally, (34) is an immediate consequence of (32) and (35).

Theorem 1. Let $t_1$ and $s$ be statistics satisfying Assumptions I-VI, and let $t_0$
be a statistic, satisfying Assumptions I, III and V, which is sufficient for the family
of distributions $\prod_{i=1}^{n} F(x_i - \theta)$, $-\infty < \theta < \infty$. Let $N_j$ denote the sample-size in
the two-stage procedure using $(t_j, s)$, $j = 0, 1$, with the same $m$, $\alpha$, $\beta$, and $\delta$. Then

$$E(N_0) \le E(N_1) \quad \text{for all } \sigma, \tag{37}$$

the equality holding for all $\sigma$ if and only if

$$\Pr\{t_0(n; X_1, \dots, X_n) = t_1(n; X_1, \dots, X_n)\} = 1 \quad \text{for all } n \ge m.$$

Proof. The hypotheses of the theorem and Corollary 2.2 enable us to use $(t_0, s)$
for the procedure described in the previous section. With the notation used there
we have from (30), $G_i(x) = F(x)$, $i = 0, 1$, and from (9),

$$\rho_0 = \rho_1. \tag{38}$$

From (13), we get

$$E(N_i) = m + \sum_{r=m}^{\infty} \bar H\{k_i(r)\sigma^{-1}\rho_i^{-1}; m\},$$

and using (32) and (38) the result follows.

Remark. Theorem 1 solves part of the problem of optimization of the two-sample
procedure by showing that if a suitable sufficient estimator of $\theta$ exists,
it is the best "$t$" to use. This leaves us with the problem of choosing "$s$". We shall
see that in the case of the normal and exponential distributions the best pair
$(t, s)$ to use, asymptotically as $\sigma \to \infty$, is the pair of sufficient statistics $(t_0, s_0)$.
Lemma 4². Let $s_0(n; x_1, \dots, x_n)$ and $s(n; x_1, \dots, x_n)$ be statistics satisfying
Assumptions II and VI, and let $t_0$ be a statistic satisfying Assumption I. Let $(t_0, s_0)$
be sufficient for the family $\prod_{i=1}^{n} F[(x_i - \theta)/\sigma]$, $-\infty < \theta < \infty$, $\sigma > 0$. Then
$t_0(n; X_1, \dots, X_n)$, $s_0(n; X_1, \dots, X_n)$, and $s(n; X_1, \dots, X_n)/s_0(n; X_1, \dots, X_n)$
are mutually independent.

Proof. This result can be proved formally along the lines used for Lemma 2,
but this seems hardly necessary, and only an outline in terms of conditional
probabilities will be given.

Let $u(n; x_1, \dots, x_n) = s(n; x_1, \dots, x_n)/s_0(n; x_1, \dots, x_n)$, and note that
$u$ is invariant under the transformation $x_i \to \sigma x_i + \theta$, $i = 1, \dots, n$.

For almost all $a$ and $b$,

$$\Pr\{u(n; X_1, \dots, X_n) \in S \mid t_0(n; X_1, \dots, X_n) = a,\ s_0(n; X_1, \dots, X_n) = b\} \tag{39}$$

is independent of $(\theta, \sigma)$ and equals

$$\Pr\{u(n; Y_1, \dots, Y_n) \in S \mid t_0(n; Y_1, \dots, Y_n) = a,\ s_0(n; Y_1, \dots, Y_n) = b\}, \tag{40}$$

using notation indicated at the beginning of Sec. 1. On the other hand, from the
hypotheses of the lemma, (39) also equals

$$\Pr\Big\{u(n; Y_1, \dots, Y_n) \in S \,\Big|\, t_0(n; Y_1, \dots, Y_n) = \frac{a - \theta}{\sigma},\ s_0(n; Y_1, \dots, Y_n) = \frac{b}{\sigma}\Big\}. \tag{41}$$

From the equality of (40) and (41), it follows that the conditional distribution of
$u$ is independent of the conditioning values of $t_0$ and $s_0$, so that $u$ is
stochastically independent of $(t_0, s_0)$. But $t_0$ and $s_0$ are mutually independent, since
Corollary 2.2 applies. Hence the result.

This lemma can be used to compare the relative merits of $s_0$ and any other $s$
asymptotically as $\sigma \to \infty$. Let us assume that the hypotheses of Lemma 4 are
satisfied, that $F$ is continuous and that $t_0$ also satisfies Assumptions III and V.
Then we know that $t_0$ is the best statistic to use as "$t$", and both $s_0$ and $s$ are
eligible as the "$s$" statistic. Let

$$J(u) = \Pr\{s(m; X_1, \dots, X_m) \le u\,s_0(m; X_1, \dots, X_m)\}, \tag{42}$$

and $H(u)$, $H_0(u)$ denote the c.d.f.'s of $s$ and $s_0$ respectively. It will be understood
that we have the same $m$ throughout the discussion. We already know

$$\Pr\{t_0(n; X_1, \dots, X_n) \le \theta + \sigma x\} = F\{x\,k(n)\}. \tag{43}$$

² My attention has been drawn to the fact that a general result of the type of those
given in Lemmas 2 and 4 has been previously given, for boundedly complete sufficient
statistics, by D. Basu [Sankhya, vol. 15 (1955), pp. 377-380.]
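Let

$$M(y) = \int_0^\infty F(yu)\,dH_0(u). \tag{44}$$

Then

$$H(v) = \int_0^\infty H_0(v/u)\,dJ(u) \tag{45}$$

by Lemma 4, and we get

$$\int_0^\infty F(yu)\,dH(u) = \int_0^\infty M(yu)\,dJ(u). \tag{46}$$

From (5), (8), and (9) on account of continuity, we have

$$\rho = (\chi - \chi')/\delta, \qquad \rho_0 = (\chi_0 - \chi_0')/\delta, \tag{47}$$

and

$$M(\chi_0) = 1 - \alpha = \int_0^\infty M(\chi u)\,dJ(u), \qquad M(\chi_0') = 1 - \beta = \int_0^\infty M(\chi' u)\,dJ(u). \tag{48}$$

Hence,

$$\chi_0 - \chi_0' = M^{-1}\Big\{\int_0^\infty M(\chi u)\,dJ(u)\Big\} - M^{-1}\Big\{\int_0^\infty M(\chi' u)\,dJ(u)\Big\}. \tag{49}$$

Now, from (14) and (15) we know that $E(N)$, $E(N_0) \to \infty$ as $\sigma \to \infty$, and

$$\frac{E(N_0)}{E(N)} \sim \frac{\int_0^\infty k^{-1}(\sigma\rho_0 u)\,dH_0(u)}{\int_0^\infty k^{-1}(\sigma\rho u)\,dH(u)}.$$

Suppose

$$k(u) = u^{1/c}, \tag{50}$$

where $c$ is a constant $\ge 1$. (This is the case in the normal and exponential
populations.) Then

$$\frac{E(N_0)}{E(N)} \sim \frac{\rho_0^c \int_0^\infty u^c\,dH_0(u)}{\rho^c \int_0^\infty u^c\,dH(u)} = \frac{\rho_0^c}{\rho^c \int_0^\infty u^c\,dJ(u)}. \tag{51}$$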
Therefore

$$\rho \Big\{\int_0^\infty u^c\,dJ(u)\Big\}^{1/c} > \rho_0$$

implies that, asymptotically as $\sigma \to \infty$, $E(N_0) < E(N)$. However, since

$$\Big\{\int_0^\infty u^c\,dJ(u)\Big\}^{1/c} \ge \int_0^\infty u\,dJ(u),$$

if we can show that

$$(\chi - \chi') \int_0^\infty u\,dJ(u) > \chi_0 - \chi_0', \tag{52}$$

this implies that asymptotically $(t_0, s_0)$ is the best, or minimum-expected-sample-size,
pair among those satisfying the initial assumptions.

We shall now prove (52) to hold in the two cases that matter. As previously
noted, $s_0$ and $as_0$, where $a$ is a constant $> 0$, are equivalent statistics for our
purpose, and hence in what follows we shall only consider as alternative
candidates, statistics $s$ which are not constant multiples of $s_0$; in other words, we
assume that $J(u)$ has at least two points of increase.

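Example 1. Let $F(x) = \int_{-\infty}^{x} e^{-u^2/2}\,du/\sqrt{2\pi}$, and assume $\alpha < 0.5 < \beta$.

$M(y)$ is Student's distribution, so that $M(0) = 0.5$. Hence

$$\chi_0' < 0 < \chi_0 \quad \text{and} \quad \chi' < 0 < \chi. \tag{53}$$

Further, $M(y)$ is concave or convex according as $y > 0$ or $< 0$, and therefore
$M^{-1}\{\int_0^\infty M(yu)\,dJ(u)\} \lessgtr y \int_0^\infty u\,dJ(u)$ according as $y \gtrless 0$. From (53) and (49),
(52) follows.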

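Example 2. Let $F(x)$ be given by (17). Then

$$M(y) = \begin{cases} 0 & \text{if } y \le 0, \\ 1 - (1 + y)^{-\mu} & \text{if } y > 0, \end{cases} \qquad \text{where } \mu = m - 1 \ge 1.$$

Consequently, all $\chi$'s are positive. Now let

$$f(y) = y \int_0^\infty u\,dJ(u) - M^{-1}\Big\{\int_0^\infty M(yu)\,dJ(u)\Big\} = y \int_0^\infty u\,dJ(u) - \Big\{\int_0^\infty (1 + yu)^{-\mu}\,dJ(u)\Big\}^{-1/\mu} + 1. \tag{54}$$

Then

$$f'(y) = \int_0^\infty u\,dJ(u) - \int_0^\infty u(1 + yu)^{-(\mu+1)}\,dJ(u) \Big\{\int_0^\infty (1 + yu)^{-\mu}\,dJ(u)\Big\}^{-(\mu+1)/\mu}.$$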
It is easily seen that $f'(y) > 0$ for $y > 0$; because the sign of $f'(y)$ is the same
as that of

$$\Big\{\int_0^\infty (1 + yu)^{-\mu}\,dJ(u)\Big\}^{(\mu+1)/\mu} \int_0^\infty u\,dJ(u) - \int_0^\infty u(1 + yu)^{-\mu-1}\,dJ(u),$$

which is

$$\ge \int_0^\infty (1 + yu)^{-\mu}\,dJ(u) \int_0^\infty (1 + yu)^{-1}\,dJ(u) \int_0^\infty u\,dJ(u) - \int_0^\infty u(1 + yu)^{-\mu-1}\,dJ(u)$$

$$> \int_0^\infty (1 + yu)^{-\mu}\,dJ(u) \int_0^\infty u(1 + yu)^{-1}\,dJ(u) - \int_0^\infty u(1 + yu)^{-\mu-1}\,dJ(u)$$

$$> 0,$$

since $u$ and $(1 + yu)^{-1}$ are monotone in opposite directions for $y > 0$, and the
same is true of $u(1 + yu)^{-1}$ and $(1 + yu)^{-\mu}$. Consequently, $f(y)$ is an increasing
function of $y > 0$; (52) follows from (49), and asymptotically as $\sigma \to \infty$,
$(t_0, s_0)$ is the best pair of statistics to use.
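
Inequality (52) for the exponential case can be checked numerically for a particular $J$. The sketch below is illustrative only: it assumes a two-point distribution $J$ (mass 1/2 at each of $u = 0.5$ and $u = 2$), $\mu = 3$, $\alpha = 0.05$ and $\beta = 0.95$, none of which come from the paper, and solves $\int M(yu)\,dJ(u) = p$ by bisection. All names are hypothetical.

```python
def M(y, mu):
    """M(y) of Example 2: 1 - (1 + y)^(-mu) for y > 0, and 0 otherwise."""
    return 0.0 if y <= 0 else 1.0 - (1.0 + y) ** (-mu)


def M_inverse(p, mu):
    return (1.0 - p) ** (-1.0 / mu) - 1.0


def mixture(y, mu, J):
    """Integral of M(y u) dJ(u) for a discrete J given as (u, weight) pairs."""
    return sum(w * M(y * u, mu) for u, w in J)


def solve_mixture(p, mu, J, iterations=200):
    """Bisection for the y > 0 with mixture(y) = p (mixture is increasing in y)."""
    lo, hi = 0.0, 1.0
    while mixture(hi, mu, J) < p:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mixture(mid, mu, J) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f(y, mu, J):
    """The function f(y) of (54)."""
    mean_u = sum(w * u for u, w in J)
    return y * mean_u - M_inverse(mixture(y, mu, J), mu)


J = [(0.5, 0.5), (2.0, 0.5)]          # assumed two-point J
mu, alpha, beta = 3, 0.05, 0.95       # assumed illustration values
mean_u = sum(w * u for u, w in J)

chi0, chi0_prime = M_inverse(1 - alpha, mu), M_inverse(1 - beta, mu)   # from (48)
chi, chi_prime = solve_mixture(1 - alpha, mu, J), solve_mixture(1 - beta, mu, J)

lhs = (chi - chi_prime) * mean_u      # left side of (52)
rhs = chi0 - chi0_prime               # right side of (52)
grid = [0.1 * i for i in range(1, 60)]
f_increasing = all(f(a, mu, J) < f(b, mu, J) for a, b in zip(grid, grid[1:]))
```

For this assumed $J$ the left side of (52) exceeds the right side, and $f$ is increasing on the grid, in line with the argument of Example 2.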
SAMPLING VARIANCES OF ESTIMATES OF COMPONENTS
OF VARIANCE


By S. R. Searle
New Zealand Dairy Board, Wellington, N. Z.¹


1. Outline. In earlier work [4] matrix methods have been developed for
obtaining the sampling variances of estimates of components of variance. These
rely on the fact that if $y = x'Fx$ is a function of variables $x$, having a
multi-normal distribution with variance-covariance matrix $V$, then the variance of $y$
is given by

$$\operatorname{var}(y) = 2 \operatorname{tr}(VF)^2. \tag{1}$$

The use of the method was demonstrated by obtaining for the case of a 1-way
classification with unequal numbers in the sub-classes, the sampling variances
of the estimates of variance components, as summarized in [1]; it was then
extended to the sampling variances of estimates of components of covariance.
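
Result (1) can be checked in two special cases where the variance is known independently. The sketch below is an illustration, not part of the earlier work cited: it assumes $x$ has a zero mean vector, uses plain nested lists for matrices, and the function names are hypothetical. In the first case $V$ and $F$ are diagonal, so $y = \sum f_i x_i^2$ with independent $x_i$ and $\operatorname{var}(y) = \sum 2 f_i^2 v_i^2$; in the second $V = \sigma^2 I$ and $F$ has every element $1/N$, so $y$ is $N$ times the squared sample mean, with variance $2\sigma^4$.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def quadratic_form_variance(V, F):
    """2 tr (VF)^2, the variance of x'Fx under (1) for zero-mean normal x."""
    VF = matmul(V, F)
    VF2 = matmul(VF, VF)
    return 2.0 * sum(VF2[i][i] for i in range(len(VF2)))


# Case 1: independent variables with variances 2 and 3, y = x1^2 + 0.5 x2^2.
V1 = [[2.0, 0.0], [0.0, 3.0]]
F1 = [[1.0, 0.0], [0.0, 0.5]]
var_case1 = quadratic_form_variance(V1, F1)
expected_case1 = 2 * (1.0 * 2.0) ** 2 + 2 * (0.5 * 3.0) ** 2

# Case 2: N = 4 independent variables with variance 1.5, y = N * (mean)^2.
N, sigma2 = 4, 1.5
V2 = [[sigma2 if i == j else 0.0 for j in range(N)] for i in range(N)]
F2 = [[1.0 / N] * N for _ in range(N)]
var_case2 = quadratic_form_variance(V2, F2)
expected_case2 = 2 * sigma2 ** 2
```

Both cases agree with the directly computed variances, which is all this sketch is meant to show; it does not address a non-zero mean vector.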

The present paper makes further use of this matrix technique to obtain the
sampling variances of estimates of components of variance from data in a 2-way
classification having unequal sub-class numbers. The model assumed is
Eisenhart's Model II, [2], and the method of estimating the components is
taken to be Henderson's Method 1, [3].


2. Model and analysis of variance. The observations $x_{ijk}$ are taken as having
the linear model

$$x_{ijk} = \mu + A_i + B_j + (AB)_{ij} + e_{ijk},$$

with $k = 1 \cdots n_{ij}$, $i = 1 \cdots a$, and $j = 1 \cdots b$. $\mu$ is a general mean, $A_i$ and
$B_j$ are main effects, $(AB)_{ij}$ is an interaction and $e_{ijk}$ is residual error. Under the
assumptions of the model, all terms (except $\mu$) are taken as being normally
distributed, with zero means, and variances $\sigma_a^2$, $\sigma_b^2$, $\sigma_{ab}^2$, and $\sigma_e^2$, which we will
write as $\alpha$, $\beta$, $\gamma$ and $\epsilon$ respectively.

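For a sample of $N$ observations in $N'$ cells of this 2-way classification an
analysis of variance can be written as

    Term                  d.f.                 Sums of squares
    Between A classes     $a - 1$              $S_a = T_a - T_f$
    Between B classes     $b - 1$              $S_b = T_b - T_f$
    Interaction A x B     $N' - a - b + 1$     $S_{ab} = T_{ab} - T_a - T_b + T_f$
    Residual              $N - N'$             $S_e = T_0 - T_{ab}$

    Total                 $N - 1$              $T_0 - T_f$
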

where the $T$'s are uncorrected sums of squares. With $n_{i.} = \sum_j n_{ij}$ and $n_{.j} = \sum_i n_{ij}$,
and using customary notation for means,

$$T_a = \sum_i n_{i.}\,\bar x_{i..}^2, \qquad T_b = \sum_j n_{.j}\,\bar x_{.j.}^2, \qquad T_{ab} = \sum_i \sum_j n_{ij}\,\bar x_{ij.}^2, \qquad T_f = N \bar x_{...}^2 \qquad \text{and} \qquad T_0 = \sum_i \sum_j \sum_k x_{ijk}^2.$$
We may note in passing that not all the expressions in the "sums of squares"
column are in fact sums of squares, notably the interaction term. It would be
more correct to label this column "quadratic forms" but the terminology
"sums of squares" has historical precedence and will be retained.

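Henderson's first method [3] for estimating the components of variance is to
equate each of the first four lines in the above analysis to its expected value.
Denoting the resulting estimates of $\alpha$, $\beta$, $\gamma$, and $\epsilon$ as $\hat\alpha$, $\hat\beta$, $\hat\gamma$, and $\hat\epsilon$, the equations
for obtaining them are

$$\begin{aligned} T_a - T_f = S_a &= (N - k_1)\hat\alpha + (k_3 - k_2)\hat\beta + (k_3 - k_5)\hat\gamma + (a - 1)\hat\epsilon, \\ T_b - T_f = S_b &= (k_4 - k_1)\hat\alpha + (N - k_2)\hat\beta + (k_4 - k_5)\hat\gamma + (b - 1)\hat\epsilon, \\ T_{ab} - T_a - T_b + T_f = S_{ab} &= (k_1 - k_4)\hat\alpha + (k_2 - k_3)\hat\beta + (N - k_3 - k_4 + k_5)\hat\gamma + (N' - a - b + 1)\hat\epsilon, \end{aligned} \tag{2}$$

$$T_0 - T_{ab} = S_e = (N - N')\hat\epsilon, \tag{3}$$

where the $k$'s are functions of the $n_{ij}$'s, namely

$$k_3 = \sum_i \frac{\sum_j n_{ij}^2}{n_{i.}}, \qquad k_4 = \sum_j \frac{\sum_i n_{ij}^2}{n_{.j}},$$

$$k_1 = \frac{1}{N} \sum_i n_{i.}^2, \qquad k_2 = \frac{1}{N} \sum_j n_{.j}^2, \qquad \text{and} \qquad k_5 = \frac{1}{N} \sum_i \sum_j n_{ij}^2.$$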
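
The quantities entering Henderson's equations are simple to compute from the data. The sketch below is a minimal illustration under assumptions of its own: the data layout `x[i][j]` (a list of observations per cell), the requirement that every cell be non-empty (so that $N' = ab$), the small data set, and the function name are all hypothetical.

```python
def uncorrected_sums(x):
    """Return (n, T, k) for data x[i][j] = list of observations in cell (i, j).

    Every cell is assumed non-empty, so the number of cells N' equals a * b.
    """
    a, b = len(x), len(x[0])
    n = [[len(x[i][j]) for j in range(b)] for i in range(a)]
    cell = [[sum(x[i][j]) for j in range(b)] for i in range(a)]
    row_n = [sum(n[i]) for i in range(a)]
    col_n = [sum(n[i][j] for i in range(a)) for j in range(b)]
    row_sum = [sum(cell[i]) for i in range(a)]
    col_sum = [sum(cell[i][j] for i in range(a)) for j in range(b)]
    N, total = sum(row_n), sum(row_sum)
    T = {
        "a": sum(row_sum[i] ** 2 / row_n[i] for i in range(a)),
        "b": sum(col_sum[j] ** 2 / col_n[j] for j in range(b)),
        "ab": sum(cell[i][j] ** 2 / n[i][j] for i in range(a) for j in range(b)),
        "f": total ** 2 / N,
        "0": sum(v * v for row in x for obs in row for v in obs),
    }
    k = {
        1: sum(r * r for r in row_n) / N,
        2: sum(c * c for c in col_n) / N,
        3: sum(sum(n[i][j] ** 2 for j in range(b)) / row_n[i] for i in range(a)),
        4: sum(sum(n[i][j] ** 2 for i in range(a)) / col_n[j] for j in range(b)),
        5: sum(n[i][j] ** 2 for i in range(a) for j in range(b)) / N,
    }
    return n, T, k


# Hypothetical data: a = b = 2, cell sizes 2, 1, 1, 3 (N = 7, N' = 4).
x = [[[1.0, 3.0], [2.0]],
     [[4.0], [0.0, 2.0, 4.0]]]
n, T, k = uncorrected_sums(x)
N, N_cells = 7, 4
eps_hat = (T["0"] - T["ab"]) / (N - N_cells)   # equation (3)
```

The remaining estimates would follow by solving the three linear equations (2) once $\hat\epsilon$ is known; that step is omitted here.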


3. Variances required. In the analysis of variance $S_e/\epsilon$ has a $\chi^2$-distribution
with $N - N'$ degrees of freedom. Hence, from Eq. (3) the variance of $\hat\epsilon$ is

$$\sigma_{\hat\epsilon}^2 = 2\epsilon^2/(N - N').$$

Using (3), Eqs. (2) give $\hat\alpha$, $\hat\beta$, and $\hat\gamma$ as linear functions of $S_a$, $S_b$, $S_{ab}$, and $\hat\epsilon$.
But $S_e$, and hence $\hat\epsilon$, is distributed independently of $S_a$, $S_b$, and $S_{ab}$. Hence
the variances and covariances of $\hat\alpha$, $\hat\beta$, and $\hat\gamma$ can be obtained as linear
functions of $\sigma_{\hat\epsilon}^2$ and the variances and covariances of $S_a$, $S_b$, and $S_{ab}$. By the nature
of the $S$'s it is easier to consider the variances and covariances of $T_a$, $T_b$, $T_{ab}$,
and $T_f$.
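Writing $P$ for the matrix of coefficients of $\hat\alpha$, $\hat\beta$, $\hat\gamma$ in Eqs. (2) these
equations can be written as

$$\begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ -1 & -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} T_a \\ T_b \\ T_{ab} \\ T_f \end{pmatrix} = P \begin{pmatrix} \hat\alpha \\ \hat\beta \\ \hat\gamma \end{pmatrix} + \hat\epsilon \begin{pmatrix} a - 1 \\ b - 1 \\ N' - a - b + 1 \end{pmatrix},$$

which we may write as

$$Ht = Pv + \hat\epsilon\,m.$$

Since $\hat\epsilon$ is independent of the terms in $Ht$, the variance-covariance matrix of
$\hat\alpha$, $\hat\beta$, and $\hat\gamma$, var $(vv')$, can be expressed in terms of the variance-covariance
matrix of the $T$'s, var $(tt')$, as

$$\operatorname{var}(vv') = P^{-1}[H \operatorname{var}(tt') H' + mm'\sigma_{\hat\epsilon}^2]P^{-1\prime} \tag{4}$$

and

$$\operatorname{cov}(\hat\epsilon v) = -P^{-1} m\,\sigma_{\hat\epsilon}^2.$$

The unknown term in these expressions is var $(tt')$, the variance-covariance
matrix of $T_a$, $T_b$, $T_{ab}$, and $T_f$, which we now proceed to obtain, term by term.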


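4. Matrix definitions and expressions. Let $U$ be a matrix having a one for
every element, its order being denoted by subscripts, thus:

    U-matrix              Order
    (all elements 1)
    $U_{i.}$              $n_{i.} \times n_{i.}$
    $U_{.j}$              $n_{.j} \times n_{.j}$
    $U_{ij}$              $n_{ij} \times n_{ij}$
    $U_{ij,rs}$           $n_{ij} \times n_{rs}$
    $U_N$                 $N \times N$

Define W-matrices in terms of the U's:

$$W_{i.} = \frac{1}{n_{i.}} U_{i.}, \qquad W_{.j} = \frac{1}{n_{.j}} U_{.j}, \qquad W_{ij} = \frac{1}{n_{ij}} U_{ij}, \qquad \text{and} \qquad W_N = \frac{1}{N} U_N.$$

Then C-matrices are defined, of order $N \times N$, whose only non-zero
sub-matrices are W's along the diagonal:

    $C_a$ has $W_{i.}$ $(i = 1 \cdots a)$ in the diagonal,
    $C_b$ has $W_{.j}$ $(j = 1 \cdots b)$ in the diagonal,
    $C_{ab}$ has $W_{ij}$ $(i = 1 \cdots a,\ j = 1 \cdots b)$ in the diagonal.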
Finally we define D-matrices, the same as C-matrices only having U-matrices
instead of W-matrices in their diagonals.

Let $x'$ be the row vector of the $N$ $x_{ijk}$'s, arrayed in order, $k = 1 \cdots n_{ij}$,
within $j$-classes, within each $i$-class; i.e.

$$x' = (x_{111} \cdots x_{11n_{11}},\ x_{121} \cdots x_{12n_{12}},\ \cdots,\ x_{ab1} \cdots x_{abn_{ab}}).$$

Then if $w'$ is the vector of the $x$'s arrayed in $k$-order within $i$-classes within each
$j$-class, $w'$ will be a transform of $x'$, $w' = x'R'$, say, where $R$ is an orthogonal
elementary operational matrix of order $N$, of identity matrices $I$.
The $T$'s can now be expressed in terms of these vectors and matrices:

$$T_a = x'C_a x,$$

$$T_b = w'C_b w = x'R'C_b R x = x'Bx, \text{ say,}$$

$$T_{ab} = x'C_{ab} x = w'C_{ab} w,$$

$$T_f = x'W_N x.$$

In the $C_{ab}$ used with $x$ the $W_{ij}$ in the diagonal are in $j$-order within $i$-order; in
that used with $w$ they are in $i$-order within $j$-order.

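$V$, the variance-covariance matrix of the $x_{ijk}$'s appropriate to $x'$ can be written
as

$$V = J + K,$$

where

$$J = \alpha D_a + (\beta + \gamma) D_{ab} + \epsilon I,$$

and

$$K = \begin{pmatrix} 0 & K_{12} & K_{13} & \cdots & K_{1a} \\ K_{21} & 0 & & & \vdots \\ \vdots & & \ddots & & \\ K_{a1} & \cdots & & & 0 \end{pmatrix}$$

with $K_{ii'}$, $i \ne i'$, $i, i' = 1 \cdots a$, of order $n_{i.} \times n_{i'.}$, having all elements zero
except those in $b$ rectangular matrices $\beta U_{ij,i'j}$, $j = 1 \cdots b$. These $b$ matrices lie
"corner to corner" across $K_{ii'}$; that is, the $j$-th of them occupies the rows belonging
to cell $(i, j)$ and the columns belonging to cell $(i', j)$.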
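For example, for a sample of 7 observations in three cells, two containing 3
observations and one containing 1, $V$ is a $7 \times 7$ matrix whose diagonal elements are
$\alpha + \beta + \gamma + \epsilon$ and whose off-diagonal elements are $\alpha + \beta + \gamma$, $\alpha$, $\beta$ or 0 according
as the two observations concerned lie in the same cell, in the same A class only,
in the same B class only, or in neither.

This is the variance-covariance matrix appropriate to $x$: that for $w$ will be

$$RVR'.$$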

5. Variances and covariances. $T_a$, $T_b$, $T_{ab}$ and $T_f$ have now been expressed
in the form $x'Fx$, and the variance-covariance matrix has also been obtained.
The sampling variances of the $T$'s will be found from (1), by evaluating
2 trace $(VF)^2$ for each of them.

5.1. var $(T_a) = 2 \operatorname{tr}(VC_a)^2$.
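$VC_a$ can be expressed as

$$VC_a = \begin{pmatrix} P_{11} & P_{12} & \cdots & P_{1a} \\ \vdots & & & \vdots \\ P_{a1} & P_{a2} & \cdots & P_{aa} \end{pmatrix}$$

where $P_{ii}$ is a column of matrices $z_{ij} U_{ij,i.}$, $j = 1 \cdots b$, and $P_{ii'}$, $i' \ne i$,
is a column of matrices $w_{i'j} U_{ij,i'.}$, $j = 1 \cdots b$, with

$$z_{ij} = (n_{i.}\alpha + n_{ij}\beta + n_{ij}\gamma + \epsilon)/n_{i.} \qquad \text{and} \qquad w_{ij} = (n_{ij}/n_{i.})\beta.$$

$VC_a$ has here been partitioned into P-matrices, which themselves have been
partitioned into sub-matrices of the U-type. Trace $(VC_a)^2$ will therefore depend
on two properties of these U-matrices, that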
(5) 


and 





eS 
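Using these results we have

$$\operatorname{tr}(VC_a)^2 = \sum_i \sum_{i'} \operatorname{tr}(P_{ii'} P_{i'i}) = \sum_i \operatorname{tr}(P_{ii})^2 + \sum_i \sum_{i' \ne i} \operatorname{tr}(P_{ii'} P_{i'i})$$

$$= \sum_i \Big(\sum_j n_{ij} z_{ij}\Big)^2 + \sum_i \sum_{i' \ne i} \Big(\sum_j n_{ij} w_{i'j}\Big)\Big(\sum_j n_{i'j} w_{ij}\Big).$$

On substituting for $z_{ij}$ and $w_{ij}$ this gives

$$\tfrac12 \operatorname{var}(T_a) = \sum_i \Big[\sum_j n_{ij}(n_{i.}\alpha + n_{ij}\beta + n_{ij}\gamma + \epsilon)/n_{i.}\Big]^2 + \sum_i \sum_{i' \ne i} \frac{(\sum_j n_{ij} n_{i'j})^2}{n_{i.}\,n_{i'.}}\,\beta^2.$$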
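5.2. var $(T_{ab}) = 2 \operatorname{tr}(VC_{ab})^2$.
$V$ and $C_{ab}$ are such that their product can be written as

$$VC_{ab} = L + K,$$

where $K$ is as in $V$, and

$$L = \alpha D_a + (\beta + \gamma) D_{ab} + \epsilon C_{ab}.$$

Hence,

$$VC_{ab} = V + \epsilon(C_{ab} - I). \tag{7}$$

Since $V$ and $C_{ab}$ are symmetric, $VC_{ab}$ is also, and hence squaring (7) gives

$$(VC_{ab})^2 = V^2 + \epsilon^2(C_{ab} - I).$$

Hence,

$$\tfrac12 \operatorname{var}(T_{ab}) = \operatorname{tr} V^2 + \epsilon^2(\operatorname{tr} C_{ab} - \operatorname{tr} I)$$

$$= \sum_i \sum_j n_{ij}\Big[(\alpha + \beta + \gamma + \epsilon)^2 + (n_{ij} - 1)(\alpha + \beta + \gamma)^2 + (n_{i.} - n_{ij})\alpha^2 + (n_{.j} - n_{ij})\beta^2 + \epsilon^2\Big(\frac{1}{n_{ij}} - 1\Big)\Big],$$

which reduces to

$$\tfrac12 \operatorname{var}(T_{ab}) = \sum_i \sum_j n_{ij}\{n_{ij}(\alpha + \beta + \gamma + \epsilon/n_{ij})^2 + (n_{i.} - n_{ij})\alpha^2 + (n_{.j} - n_{ij})\beta^2\}.$$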
5.3. var $(T_f) = 2 \operatorname{tr}(VW_N)^2$.

Similar to the form of the P-matrices in 5.1, $VW_N$ can be expressed as a column
of matrices $y_{ij} U_{ij,N}$, $(i = 1 \cdots a,\ j = 1 \cdots b)$, where

$$y_{ij} = (n_{i.}\alpha + n_{.j}\beta + n_{ij}\gamma + \epsilon)/N.$$

Hence,

$$\operatorname{tr}(VW_N)^2 = \sum_i \sum_j n_{ij} y_{ij} \sum_i \sum_j n_{ij} y_{ij},$$

giving

$$\tfrac12 \operatorname{var}(T_f) = \Big\{\sum_i \sum_j n_{ij}(n_{i.}\alpha + n_{.j}\beta + n_{ij}\gamma + \epsilon)/N\Big\}^2.$$
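
The closed forms for var $(T_a)$, var $(T_{ab})$ and var $(T_f)$ can be compared with a direct evaluation of $2 \operatorname{tr}(VF)^2$ for a small layout. The sketch below is an illustration under assumptions of its own: the cell sizes and the values of $\alpha$, $\beta$, $\gamma$, $\epsilon$ are arbitrary, every cell is non-empty, matrices are plain nested lists, and all names are hypothetical. $V$ is built element by element from the model rather than from the $J + K$ partition.

```python
def layout(n):
    """Cell label (i, j) of each observation, in k-order within j within i."""
    return [(i, j) for i, row in enumerate(n) for j, nij in enumerate(row) for _ in range(nij)]


def covariance(cells, alpha, beta, gamma, eps):
    """V: alpha if same A class, + beta if same B class, + gamma if same cell, + eps on the diagonal."""
    return [[alpha * (i == r) + beta * (j == s) + gamma * (i == r and j == s) + eps * (p == q)
             for q, (r, s) in enumerate(cells)] for p, (i, j) in enumerate(cells)]


def averaging(cells, key):
    """Matrix with 1/(group size) wherever two observations share key(cell), else 0."""
    groups = [key(c) for c in cells]
    return [[1.0 / groups.count(g) if g == h else 0.0 for h in groups] for g in groups]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def var_qf(V, F):
    """2 tr (VF)^2."""
    VF = matmul(V, F)
    sq = matmul(VF, VF)
    return 2.0 * sum(sq[p][p] for p in range(len(sq)))


def closed_forms(n, al, be, ga, ep):
    a, b = len(n), len(n[0])
    R = [sum(n[i]) for i in range(a)]                          # n_i.
    C = [sum(n[i][j] for i in range(a)) for j in range(b)]     # n_.j
    N = sum(R)
    ij = [(i, j) for i in range(a) for j in range(b)]
    v_a = sum((sum(n[i][j] * (R[i] * al + n[i][j] * (be + ga) + ep) for j in range(b)) / R[i]) ** 2
              for i in range(a))
    v_a += sum(be ** 2 * sum(n[i][j] * n[r][j] for j in range(b)) ** 2 / (R[i] * R[r])
               for i in range(a) for r in range(a) if r != i)
    v_ab = sum(n[i][j] * (n[i][j] * (al + be + ga + ep / n[i][j]) ** 2
                          + (R[i] - n[i][j]) * al ** 2 + (C[j] - n[i][j]) * be ** 2) for i, j in ij)
    v_f = (sum(n[i][j] * (R[i] * al + C[j] * be + n[i][j] * ga + ep) for i, j in ij) / N) ** 2
    return 2 * v_a, 2 * v_ab, 2 * v_f


n = [[2, 1], [1, 3]]                      # assumed cell sizes n_ij
params = (1.5, 0.7, 0.4, 2.0)             # assumed alpha, beta, gamma, epsilon
cells = layout(n)
V = covariance(cells, *params)
direct = (var_qf(V, averaging(cells, lambda c: c[0])),    # T_a : F = C_a
          var_qf(V, averaging(cells, lambda c: c)),       # T_ab: F = C_ab
          var_qf(V, averaging(cells, lambda c: 0)))       # T_f : F = W_N
formula = closed_forms(n, *params)
```

For this layout the three directly computed values agree with the closed forms to rounding error.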
5.4. In general, for any two square matrices of the same order, $A$ and $B$,
it can be shown that $\operatorname{tr}(A + B)^2 = \operatorname{tr} A^2 + \operatorname{tr} B^2 + 2 \operatorname{tr} AB$. If then, $a$ and
$b$ are two functions of the same set of variables such that var $(a) = 2 \operatorname{tr} A^2$,
var $(b) = 2 \operatorname{tr} B^2$, it follows at once that

$$\operatorname{cov}(ab) = 2 \operatorname{tr} AB = 2 \operatorname{tr} BA. \tag{8}$$

This result will be used for obtaining the covariances among $T_a$, $T_b$, $T_{ab}$,
and $T_f$.

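5.5. cov $(T_a, T_{ab}) = 2 \operatorname{tr}(VC_a VC_{ab})$.

In 5.1 $VC_a$ has been partitioned into $P_{ii}$'s and $P_{ii'}$'s. If $VC_{ab}$, expressed as
$L + K$ in 5.2, is partitioned in the same manner, into $L_{ii}$'s and $K_{ii'}$'s, then

$$\tfrac12 \operatorname{cov}(T_a, T_{ab}) = \sum_i \sum_{l=1}^{n_{i.}} (\text{inner product of } l\text{'th row of } P_{ii} \text{ and } l\text{'th column of } L_{ii})$$

$$+ \sum_i \sum_{i' \ne i} \sum_{l=1}^{n_{i.}} (\text{inner product of } l\text{'th row of } P_{ii'} \text{ and } l\text{'th column of } K_{i'i})$$

and after substitution this reduces to

$$\tfrac12 \operatorname{cov}(T_a, T_{ab}) = \sum_i \sum_j n_{ij}(n_{i.}\alpha + n_{ij}\beta + n_{ij}\gamma + \epsilon)^2/n_{i.} + \beta^2 \sum_i \sum_j n_{ij}^2(n_{.j} - n_{ij})/n_{i.}.$$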

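5.6. $\tfrac12$ cov $(T_a, T_f) = \operatorname{tr}(VW_N)(VC_a)$.

Using 5.1 and 5.4, and Eq. (6), this can be expressed as

$$\tfrac12 \operatorname{cov}(T_a, T_f) = \sum_i \sum_j n_{ij} y_{ij} \Big(\sum_j n_{ij} z_{ij} + \sum_{i' \ne i} \sum_j n_{i'j} w_{ij}\Big),$$

which on substitution for the $z$'s, $y$'s and $w$'s, reduces to

$$\tfrac12 \operatorname{cov}(T_a, T_f) = \sum_i \sum_j \frac{n_{ij}}{N}(n_{i.}\alpha + n_{.j}\beta + n_{ij}\gamma + \epsilon) \Big(n_{i.}\alpha + \beta\,\frac{\sum_j n_{ij} n_{.j}}{n_{i.}} + \gamma\,\frac{\sum_j n_{ij}^2}{n_{i.}} + \epsilon\Big).$$

5.7. $\tfrac12$ cov $(T_{ab}, T_f) = \operatorname{tr}(VW_N)(VC_{ab})$

$$= \sum_i \sum_j n_{ij} y_{ij} \Big[\sum \text{terms in } ij\text{'th column of } V + \epsilon(C_{ab} - I)\Big]$$

$$= \sum_i \sum_j n_{ij}(n_{i.}\alpha + n_{.j}\beta + n_{ij}\gamma + \epsilon)^2/N.$$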
5.8. In Eq. (8) it is required that $a$ and $b$ be functions of the same set of
variables; therefore, in terms of paragraph 4, the covariance of $T_a$ and $T_b$ must be
expressed as

$$\operatorname{cov}(T_a, T_b) = 2 \operatorname{tr}(VC_a)(VB).$$

This covariance is a little more cumbersome to evaluate than previous ones;
the method used is essentially a generalization of earlier paragraphs.

$B$ is the same form as $V$, (4.5) but with matrices $(1/n_{.j})U_{ij}$ in the diagonal,
$j = 1 \cdots b$, for $i = 1 \cdots a$, and with $K_{ii'}$-matrices having terms

$$(1/n_{.j})U_{ij,i'j}.$$

Now partition $V$ into matrices $(V)_{ij:kl}$ of order $n_{ij} \times n_{kl}$, there being four
different forms of this matrix according as $k$ and $l$ are equal or not equal to $i$
and $j$ respectively, namely:

$$(V)_{ij:ij} = (\alpha + \beta + \gamma)U_{ij} + \epsilon I;$$

$$(V)_{ij:il} = \alpha U_{ij,il} \quad \text{for } l \ne j;$$

$$(V)_{ij:kj} = \beta U_{ij,kj} \quad \text{for } k \ne i;$$

$$(V)_{ij:kl} = 0 \cdot U_{ij,kl}, \text{ a zero matrix, for } k \ne i,\ l \ne j.$$

$B$ can be partitioned similarly for $k \ne i$ and $l \ne j$:

$$(B)_{ij:ij} = \frac{1}{n_{.j}} U_{ij},$$

$$(B)_{ij:il} = 0 \cdot U_{ij,il},$$

$$(B)_{ij:kj} = \frac{1}{n_{.j}} U_{ij,kj},$$

$$(B)_{ij:kl} = 0 \cdot U_{ij,kl}.$$


(9) (FWhecic = 2 2s Vee es 


Ls Ls ' vaso 
whose right-hand side can be expanded as 
(V)pa:eu(B)eusew + 24 (V) pa:su(B) use 
Is 
+ Zz (V) pq:t0(B) to: tw + _ 2 (V) pa:to6B) so: tu 


Gu St oeu 


Or as 
(V) ne:90(B) pe: 00 + » (V) pq:¢q(B) sq: tu 
¥~P 
+ ie (V) p¢:70(B) po: tu ? Zz (V) pq:40(B) so: tu - 


ox [AP a4 


These expressions are true for any values of the subscripts. 
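Applying this identity to the partitioned forms of $V$ and $B$ given above, and
using the principle of (5) in 5.1 gives

$$(VB)_{ij:ij} = n_{ij}(\alpha + \beta + \gamma)/n_{.j}\,U_{ij} + \epsilon/n_{.j}\,U_{ij} + \beta \sum_{k \ne i} \frac{1}{n_{.j}} U_{ij,kj} U_{kj,ij}$$

$$= (n_{ij}\alpha + n_{.j}\beta + n_{ij}\gamma + \epsilon)/n_{.j}\,U_{ij} = b_{ij} U_{ij}, \text{ say.}$$

Similarly, for $r \ne i$, and $s \ne j$,

$$(VB)_{is:ij} = \alpha \frac{n_{ij}}{n_{.j}} U_{is,ij} = b'_{ij} U_{is,ij}, \text{ say,}$$

$$(VB)_{rj:ij} = (n_{rj}\alpha + n_{.j}\beta + n_{rj}\gamma + \epsilon)/n_{.j}\,U_{rj,ij} = b_{rj} U_{rj,ij},$$

$$(VB)_{rs:ij} = \alpha \frac{n_{rj}}{n_{.j}} U_{rs,ij} = b'_{rj} U_{rs,ij}.$$

Likewise:

$$(VC_a)_{ij:ij} = (n_{i.}\alpha + n_{ij}\beta + n_{ij}\gamma + \epsilon)/n_{i.}\,U_{ij} = a_{ij} U_{ij}, \text{ say,}$$

$$(VC_a)_{ij:is} = (n_{i.}\alpha + n_{ij}\beta + n_{ij}\gamma + \epsilon)/n_{i.}\,U_{ij,is} = a_{ij} U_{ij,is},$$

$$(VC_a)_{ij:rj} = \beta \frac{n_{rj}}{n_{r.}} U_{ij,rj} = a'_{rj} U_{ij,rj}, \text{ say,}$$

$$(VC_a)_{ij:rs} = \beta \frac{n_{rj}}{n_{r.}} U_{ij,rs} = a'_{rj} U_{ij,rs}.$$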


3 cov (T.T») = tr (VC.)(VB) 
= > a tr (VC,: VB) ij: 33 
. | 


ai b 2 Zz [tr p S (VCa) ij:re( VB) re: <3]. 


Applying the identity (9) and results (5) and (6) again, and using the forms 
of the elements of the sub-matrices of VC, and VB given above, gives 


1 cov (T., Ts) 


= ED nay {nisaijbis + 2d Mis aij bi; + 2X yj ej br3 + DY De Mee Ors des} 
‘ 3 a3) ri 


ri ej 


On substituting for the a’s and b’s, this reduces to 


3 cov (Ts, Ts) = > DS — 
. #£ 


Ni. 


2 
i 


—— (nia + 058 + nsy + e)*. 


5.9. We have now found some of the variances and covariances of $T_a$, $T_b$,
$T_{ab}$ and $T_f$. These, and those which follow from them by symmetry, are
summarized in the following table.

VARIANCES AND COVARIANCES OF UNCORRECTED SUMS OF SQUARES


var (T,) 


(dS nin; 
= 215 (Ems (na + mud + my +9, m+ 2 a 


1 i’ 4 My. Ni. 


ee 


var ty 


(d nin #)’ . 
HEE ni(ngza +n j8B + nijy + )/n,5]" ee atten aT 


3 3'#i nN. 5 1. 5" 


var (Ts) 

= 2 Zz > nilnijla + B+ y + €/n,)? + (n. — nia? + (n; — 058") 
var (7,) 

- 2[>- Ze ni(ns, a + n38 + ngy + e)/N} 


cov (T., Ts) 


= 200 (nat 58 + ny7 + 0 
a J 


5 NG. 


cov (T., Tab) 
= 200d nila + nyB + ny + 0'/n, +8 DD nin; — n/n} 
‘ I + 2 


cov (T., T;) 





= 32. z a (nia +58 + nisy + ©) = + B— 


n; Ni. 


x nj Ni; = ni; ) 
€ 


cov (T;, Ta) 
- 2{2 yy Nij(nija + n;8 + nsy + €)/Nn.; + a : as nii(ni. =. Nis), nj} 
j . ss 


cov (T,, T;) 





=2E DF wat ns penal ama + Bn ty 


J J 


X Ni, Ni; i ni; 
+e 
cov (Ta, T's) 
= 22 Vania +58 + nyy + 07/N 


The expressions in the above table are those of the elements of the matrix
var (tt') of Eq. (4). These elements are quadratic functions of the variance
components $\alpha$, $\beta$, $\gamma$, and $\epsilon$, with coefficients that are sums of functions
of the $n_{ij}$'s. Inserting the elements of var (tt') as now known into (4) would
not simplify the other terms there, and so would not simplify var (vv'). Since in
any numerical case these steps are quite straightforward once the expressions in
the table have been calculated, it seems convenient to leave the results in their
present form.


6. Balanced Data. It is easily shown that the formulae developed in the last
paragraph reduce to the well-known results for balanced data when all the
$n_{ij}$ are put equal to $n$. For example, consider the variance of $S_a$. From the
Analysis of Variance table, the expected value of $S_a$ is given by

$E(S_a) = (a - 1)(bn\alpha + n\gamma + \epsilon).$

Then

$\operatorname{var}(T_a) = 2\{a(bn\alpha + n\beta + n\gamma + \epsilon)^2 + a(a - 1)n^2\beta^2\},$
$\operatorname{var}(T_f) = 2(bn\alpha + an\beta + n\gamma + \epsilon)^2,$
$\operatorname{cov}(T_a, T_f) = 2(bn\alpha + an\beta + n\gamma + \epsilon)^2.$

Hence,

$\operatorname{var}(S_a) = \operatorname{var}(T_a - T_f)$
$\qquad = 2(a - 1)(bn\alpha + n\gamma + \epsilon)^2$
$\qquad = 2[E(S_a)]^2/(a - 1),$

and with $M_a = S_a/(a - 1)$, this gives the familiar result for mean squares

$\operatorname{var}(M_a) = 2[E(M_a)]^2/(a - 1).$

Results similar to this can be obtained for $M_b$ and $M_{ab}$, the mean squares for
B-effects and interaction.
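
The reduction above is a short piece of algebra that can be checked mechanically. The following is a minimal sketch, assuming the three balanced-data expressions exactly as written in this section; the function names and the numerical values are illustrative only and do not come from the paper.

```python
from fractions import Fraction


def var_S_a_from_T(a, b, n, alpha, beta, gamma, eps):
    """var(T_a - T_f) = var(T_a) + var(T_f) - 2 cov(T_a, T_f), balanced data."""
    var_Ta = 2 * (a * (b * n * alpha + n * beta + n * gamma + eps) ** 2
                  + a * (a - 1) * n ** 2 * beta ** 2)
    var_Tf = 2 * (b * n * alpha + a * n * beta + n * gamma + eps) ** 2
    cov_Ta_Tf = 2 * (b * n * alpha + a * n * beta + n * gamma + eps) ** 2
    return var_Ta + var_Tf - 2 * cov_Ta_Tf


def var_S_a_closed_form(a, b, n, alpha, gamma, eps):
    """The reduced form 2(a - 1)(bn*alpha + n*gamma + eps)^2."""
    return 2 * (a - 1) * (b * n * alpha + n * gamma + eps) ** 2


# Arbitrary illustrative values; exact rational arithmetic avoids rounding.
args = dict(a=3, b=4, n=5, alpha=Fraction(1, 2), gamma=Fraction(1, 3), eps=Fraction(2))
lhs = var_S_a_from_T(beta=Fraction(3), **args)
rhs = var_S_a_closed_form(**args)
print(lhs == rhs)
```

Note that $\beta$ drops out of the result, as it must, since $S_a$ is the sum of squares for the A-effects.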


7. Conclusion. Matrix methods have been developed for finding the sampling
variances of estimates of components of variance. In earlier work (4) these were
used for data in a 1-way classification, and this paper has extended them to
data for a 2-way classification, with unequal numbers of observations in the
subclasses. The estimates of the components of variance for main effects and
interaction are expressed as linear functions of the corrected sums of squares and
the estimate of the error variance component. By expressing the corrected sums of
squares as functions of the uncorrected sums of squares, the variance-covariance
matrix of the estimates of the components of variance has been expressed as a
function of that for the uncorrected sums of squares (Eq. 4). Expressions have
then been found for the elements of this, the variance-covariance matrix of the
uncorrected sums of squares. It has been checked that when the data are
assumed balanced, i.e., all $n_{ij}$ equal to $n$, these expressions reduce to the
appropriate forms for variances of mean squares, which then have independent
$\chi^2$-distributions. Estimates with any optimum properties have not been obtained,
and it would seem that the only feasible estimation procedure in a practical
case would be that of replacing the variance components in these formulae by
their estimates.


It is hoped that these methods can next be extended to data in a 3-way classi- 
fication with unequal subclass numbers, still based on Eisenhart’s Model II 
and using Henderson’s Method I for estimation. 
ON SOME DISTRIBUTIONS RELATED TO THE STATISTIC $D_n^+$


By Z. W. BIRNBAUM AND RONALD PYKE¹
University of Washington


1. Introduction and summary. Let $X_1 < X_2 < \cdots < X_n$ be a sample of size
n, ordered increasingly, of a one-dimensional random variable X which has the
continuous cumulative distribution function F. It is well known, [1], that the
statistic

(1)  $D_n^+ = \sup_{-\infty < x < +\infty} \{F_n(x) - F(x)\},$

where $F_n(x)$ is the empirical distribution function determined by $X_1, X_2, \dots, X_n$,
has a probability distribution independent of F. One may, therefore, assume
that X has the uniform distribution in (0, 1) and, observing that the supremum
in (1) must be attained at one of the sample points, write without loss of
generality

(2)  $D_n^+ = \max_{1 \le i \le n} (i/n - U_i),$

where $U_1 < U_2 < \cdots < U_n$ is an ordered sample of a random variable with
uniform distribution in (0, 1).

For a given n > 0 define the random variable i* as that value of i, determined
uniquely with probability 1, for which the maximum in (2) is reached, i.e., such
that

(3)  $D_n^+ = i^*/n - U_{i^*},$

and write

$U^* = U_{i^*}.$

The main object of this paper is to obtain the distribution functions of (i*, U*),
of i* and of U*. The asymptotic distribution of $\alpha_n = i^*/n$ is also investigated,
and bounds are obtained on the difference between the exact and the asymptotic
distribution.
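
To make the objects in (2) and (3) concrete, here is a minimal sketch that computes $D_n^+$, $i^*$ and $U^*$ for a sample assumed to be uniform on (0, 1). The function name `d_plus` and the sample values are illustrative assumptions, not part of the paper.

```python
def d_plus(sample):
    """Return (D_n^+, i*, U*) for a sample of points in (0, 1).

    D_n^+ = max over i of (i/n - U_i), with U_1 < ... < U_n the ordered
    sample; i* is the index attaining the maximum and U* = U_{i*}.
    """
    u = sorted(sample)
    n = len(u)
    # max() returns the first maximizing index; ties have probability zero.
    i_star = max(range(1, n + 1), key=lambda i: i / n - u[i - 1])
    u_star = u[i_star - 1]
    return i_star / n - u_star, i_star, u_star


d, i_star, u_star = d_plus([0.9, 0.1, 0.5])
print(d, i_star, u_star)
```

For this sample the deviations $i/n - U_i$ are roughly 0.233, 0.167 and 0.1, so the maximum is reached at the smallest observation.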


A number of general identities, which are not commonly known, have been 
verified and used in proving the above-mentioned results. Since these identities 
may be helpful in other problems of this type, they are separated from the main 
proofs and appear in the next section. 

Received March 25, 1957; revised July 26, 1957. 

1 Research under the sponsorship of the Office of Naval Research. The second author’s 
research was also supported by the Ontario Research Foundation. This paper was presented 
at the Seattle meeting of the Institute in August, 1956. 




2. Some useful lemmas.
LEMMA 1. For all real a, b and integer $n \ge 0$

(4)  $\sum_{i=0}^{n} \binom{n}{i} (a + i)^i (b - i)^{n-i} = n! \sum_{i=0}^{n} \frac{(a + b)^i}{i!}.$

PROOF. The identity

(5)  $(b - n) \sum_{i=0}^{n} \binom{n}{i} (a + i)^i (b - i)^{n-i-1} = (a + b)^n,$

for all real a, b and integer $n \ge 0$ (for b = n the left-hand term is defined as the
limit for $b \to n$) was proven by Abel ([2], Vol. 1, p. 102). Denoting the left side
of (4) by $f_n(a, b)$ we have

$f_n(a, b) - n f_{n-1}(a, b) = (b - n) \sum_{i=0}^{n} \binom{n}{i} (a + i)^i (b - i)^{n-i-1} = (a + b)^n$

by (5). For n = 1, (4) is obviously true. Assuming it is true for n - 1 we have

$f_n(a, b) = n f_{n-1}(a, b) + (a + b)^n = n! \sum_{i=0}^{n} \frac{(a + b)^i}{i!},$

which completes the proof of (4) by induction.
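
Abel's identity (5) and identity (4) can be spot-checked in exact rational arithmetic. The sketch below is only a numerical illustration for a few small n; the function names and the test values of a and b are assumptions chosen so that no term $(b - i)$ vanishes.

```python
from fractions import Fraction
from math import comb, factorial


def abel_left(a, b, n):
    """Left side of (5); requires b != n so that (b - n)^(-1) exists."""
    total = sum(comb(n, i) * (a + i) ** i * (b - i) ** (n - i - 1)
                for i in range(n + 1))
    return (b - n) * total


def lemma1_left(a, b, n):
    """Left side of (4)."""
    return sum(comb(n, i) * (a + i) ** i * (b - i) ** (n - i)
               for i in range(n + 1))


def lemma1_right(a, b, n):
    """Right side of (4): n! times the truncated exponential series in a + b."""
    return factorial(n) * sum((a + b) ** i / factorial(i) for i in range(n + 1))


a, b = Fraction(1, 2), Fraction(7, 3)  # non-integer b keeps every base nonzero
for n in range(6):
    assert abel_left(a, b, n) == (a + b) ** n
    assert lemma1_left(a, b, n) == lemma1_right(a, b, n)
print("identities (4) and (5) hold for n = 0, ..., 5 at the chosen a, b")
```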
LEMMA 2. For all real a, b and integers $n \ge 0$

(6)  $\sum_{i=0}^{n-1} \binom{n}{i} (a + i)^i (b - i)^{n-i-1} = \sum_{i=0}^{n-1} (a + b)^i (a + n)^{n-i-1}.$

PROOF. For $b \ne n$, the left side of (6) is by Abel's identity (5) equal to

$[(a + b)^n - (a + n)^n](b - n)^{-1},$

which is equal to the right-hand side of (6), summed as a geometric progression.
That (6) is true for b = n follows from the continuity of both sides of (6).
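
Identity (6) involves only non-negative powers, so it can be checked directly, including at the boundary case b = n covered by the continuity argument. This is a small illustrative sketch; the function names and sample values are assumptions.

```python
from fractions import Fraction
from math import comb


def lemma2_left(a, b, n):
    """Left side of (6): the Abel sum with the i = n term omitted."""
    return sum(comb(n, i) * (a + i) ** i * (b - i) ** (n - i - 1)
               for i in range(n))


def lemma2_right(a, b, n):
    """Right side of (6): a finite geometric-type sum."""
    return sum((a + b) ** i * (a + n) ** (n - i - 1) for i in range(n))


a = Fraction(3, 4)
for n in range(7):
    for b in (Fraction(7, 3), n):  # a generic b and the boundary case b = n
        assert lemma2_left(a, b, n) == lemma2_right(a, b, n)
print("identity (6) holds for n = 0, ..., 6 at the chosen a, b")
```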
LEMMA 3. For all real a, b and integers n > 0

(7)  $(a - 1)(b - n) \sum_{i=0}^{n} \frac{1}{i + 1} \binom{n}{i} (a + i)^i (b - i)^{n-i-1} = \frac{1}{n + 1} [(a + b)^n (a + b - n - 1) - (b + 1)^n (b - n)].$

PROOF. Since $(a - 1)/(i + 1) = (a + i)/(i + 1) - 1$, we may write

$(a - 1)(b - n) \sum_{i=0}^{n} \frac{1}{i + 1} \binom{n}{i} (a + i)^i (b - i)^{n-i-1}$
$\qquad = \frac{b - n}{n + 1} \sum_{i=0}^{n} \binom{n + 1}{i + 1} (a - 1 + i + 1)^{i+1} (b + 1 - i - 1)^{n-i-1} - (b - n) \sum_{i=0}^{n} \binom{n}{i} (a + i)^i (b - i)^{n-i-1}.$

Applying Lemma 2 to the first sum and identity (5) to the second, one concludes
that this is equal to the right-hand side of (7).
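COROLLARY 1. For all integers n > 0 we have

(8)  $\sum_{i=0}^{n-1} \frac{1}{i + 1} \binom{n}{i} i^i (n - i)^{n-i-1} = (n + 1)^{n-1}.$

PROOF. For a = 0, (7) yields

$\sum_{i=0}^{n-1} \frac{1}{i + 1} \binom{n}{i} i^i (b - i)^{n-i-1} = \frac{1}{n + 1} \left[ \frac{b^n - n^n}{b - n} - b^n + (b + 1)^n \right],$

and (8) follows for $b \to n$.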
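
Identity (8) is a statement about integers and rationals only, so it is easy to confirm for small n. The sketch below assumes the convention $0^0 = 1$ for the i = 0 term (which is also what Python uses); the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def corollary1_left(n):
    """Left side of (8), with 0**0 taken as 1 for the i = 0 term."""
    return sum(Fraction(comb(n, i) * i ** i * (n - i) ** (n - i - 1), i + 1)
               for i in range(n))


for n in range(1, 9):
    assert corollary1_left(n) == (n + 1) ** (n - 1)
print([int(corollary1_left(n)) for n in range(1, 5)])  # 1, 3, 16, 125
```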


3. The distributions of (i*, U*), i* and U*. The following notations will be
used: for any function f, denote by $[f \in B]$ that subset of the domain of f on which
f takes values in B, a subset of the range of f; for any univariate distribution
function F, let $P_F$ denote the n-dimensional product measure determined by
the probability measure associated with F; without a subscript, P will be that
measure determined by the uniform distribution function; the value of n, though
suppressed in the notation, shall always be made clear by the particular
circumstances of its use; furthermore, for $j = 1, 2, \dots, n$ and $u \in [0, 1]$, set

(9)  $p_j = P[i^* = j]; \quad G^*(u, j) = P[U^* \le u,\ i^* \le j]; \quad H^*(u) = P[U^* \le u];$

for real x, [x] denotes the greatest integer less than x.

All the theorems of this section are stated at the outset, and the proofs are 
then presented in what appears a natural sequence. 

THEOREM 1. The probabilities for i* are given by

(10)  $p_j = n^{-n} \sum_{i=n-j}^{n-1} \frac{1}{i + 1} \binom{n}{i} i^i (n - i)^{n-i-1}$

for $j = 1, 2, \dots, n$.
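
Formula (10) can be evaluated exactly for small n. The following sketch, with an illustrative function name `p`, tabulates the $p_j$ as rationals and checks that they sum to one and increase with j; for n = 2 it gives $p_1 = 1/4$, which agrees with the direct calculation $P[U_2 - U_1 > 1/2] = 1/4$ for two ordered uniform observations.

```python
from fractions import Fraction
from math import comb


def p(n):
    """Return [p_1, ..., p_n] from formula (10) as exact fractions."""
    terms = [Fraction(comb(n, i) * i ** i * (n - i) ** (n - i - 1), i + 1)
             for i in range(n)]
    # p_j sums the terms with index i = n - j, ..., n - 1.
    return [sum(terms[n - j:]) / n ** n for j in range(1, n + 1)]


print(p(2))  # p_1 = 1/4, p_2 = 3/4
for n in range(1, 8):
    probs = p(n)
    assert sum(probs) == 1
    assert probs == sorted(probs)
```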
THEOREM 2. The joint probability distribution of i* and U* is given for all
$u \in [0, 1]$ by

$G^*(u, k) = \sum_{j=1}^{k} K(u, j) \qquad (k = 1, \dots, n),$

where

(11)  $K(u, j) = P[U^* \le u,\ i^* = j] =$


$p_j$  if $nu \ge j$,


n = (*)« —u)”*"G — nun? > (°) (nu —¢— 1)" “(t+1)™ 


i=) t=[nu) t 


if nu <j. 
THEOREM 3. The random variable U* is uniformly distributed over [0, 1].
PROOF OF THEOREM 2. For $j = 1, 2, \dots, n$, consider the events


B; = (U* < u;7* = J] 


= [U; < u;j/n — U; > i/n — U;, ( ¥ 3)). 
Employing the transformation 


y= Us; B= U-U;, GHD 
one obtains 
B; = (2; S u;Z; > (¢ — 9) /n, (t # j)). 


Setting Z = (Z,, Z:,--- Z,), the joint probability element of Z for 7 fixed is 


dH (Z) = n!dZ,dZ, +--+ dZn 


for 


[(-2, S$ 2<2:58-:-S3Z,i808 Zn S-:- SZ, 


IlA 


— Z;] 


43 


and zero elsewhere. Assume u and j fixed such that nu <j and 1 <j 


<< es 
Writing \ = [nu], one has 
K(u,j) = | dH,(Z) 
~B; 
0 Zj-1 eZ y—r+1 Zj-. eZ 
= nt | dZ 5-1 dZ j-2 ne | dZj-» dZ; “ee ade, | dZ, 
—l/n —2/n —A/n —1 eo 
u 71—Z; -Zn -Z5+2 
| dZ; | dZp | @Za.-- | dB. 
—Zy “(n—7)/n “(n—-j—l)/n “ijn 
By the linear transformation 
are for 7 = 12,°-°*,nm—J,; 
t= 41 — Z; fort =n—-jt+l. 
Lh + Ze -o+ts fori=as-j+ 2,28 -j3 +4, ---,8, 
one obtains 
1 


i (u, 7) = n! | din -*: | AXLnd11 | dZn-» 


~(n lijn “t(n—A)/n “l—u 
(12) 


pin—37+2 "Zn 


| din—j+1 i Atsj hte | dxe | : dz. 


“(n-—j)/n Y2/in “ljn 


Denote by $J_i$ the result of integration up to and including that with respect to
$x_i$. By repeated integration one finds


nj n—-j-1l 
J as Tn—j+1 Cn—j+1 
vn-j — = 





(n—j)! nm—j—D! 
Hence


a tetas 
Jn-i41 = gcTs iy! - ta = i + Wu, j), 

where fori = 0,1, 2,---,J, 
(1 — u)”* 


. oo. ae 
Wu, J n(n —i+ 1)! 


(nu —i+ 1). 


Repeated integration gives


n—h n—r—-1 


a a 3 ‘ (x “wi 1 i. a 
| a W (u, 7) ————————__ . 
ts (n — X)! spent + Be (u, 3) (1—A-— 1)! 





By properties of the binomial expansion one obtains
n—X\ n—d—1 nm—r 


IE a aa ie wt en (" . ‘) (taayi — 1+ 4)" 
(n — A)! 


n(n —r»— 1)! (n — r)! 8 
(A — uy" *"[u — (8 + A)/n) 


and therefore 


J ] = /a =i (r ail 
“ ee S-waek 6 JCC 


‘(1 — u)**"*"[(s + A)/n — ud. 
The identity 


“1 Za -Zn—h42 


as. | dXn-1 see | : (wu — 1+ Seeesy drn-r+1 see drn-1 dr, 


“(n—-2)/n “(n—A)/n 


A-1 


= 7 =i n e+h | (na) oa = ie ‘) (nu ig A: te ae de t)] 


t=( 


is easily proven by induction on $\lambda$. Applying (5) one shows that the right side
of this identity is equal to


a > rs a (nw 4 — 1" + 0. 
Hence it follows from (13) that 
K(u,j) = niJn 
— n ' roe are (str 
, (7 ,)a-w (s+ — nu)n > ( ‘ ) 
‘(nu —t— 1°" (t+ DZ 
which is the expression in (11) in the case nu < j. 


With a few minor changes, the above argument may also be used to prove
Theorem 2 for j = 1 and j = n. For example, in the discussion preceding (12)
one has to define $Z_0 = Z_{n+1} = 0$, in (12) $J_0 = 1$, and in (13) $x_{n+1} = 1$. Since
j = 1 has $\lambda = 0$, the theorem in this case follows directly from (13).

To complete the proof of Theorem 2, it remains to consider the case of
$u \ge j/n$. Since $D_n^+ \ge 0$ then, by (3), $U^* \le i^*/n$. This implies that for $u \ge j/n$
we have

$[U^* \le u,\ i^* = j] = [U^* \le j/n,\ i^* = j];$

hence the first statement of (11) is true.
PROOF OF THEOREM 1. We have
pi = Pl* =j S j/n,i* = 3} 
i= K(u, 3) 
. tC ya — j/n)**"G — jn * > @) (j-t#-—1)" e+ 0) 
i=j t=)—1 


which, after neglecting zero terms and interchanging summations, becomes for 
s=1- 1 


pi = Nn EC) e+ tS ("7 \j-t- a -pntett-D 


t=) s=0 
n ‘ n t-1 n—t—1l 
= 7 = (")e+0 (n—t — 1) 


t=j 


by a direct application of the binomial expansion. Setting i = n — ¢ — 1 for 
t <n, one obtains 


n—j—1 
p=n*(n+ 1 —n™ ; ri (") i‘(n — i)”, 
+=0 


the last sum being zero for 7 = n. By Corollary 1 it follows that for all j, 


n—-l “XS J 
- 1 n\{2 ; t\n—s-1 
not 2 itive ea 


This completes the proof of Theorem 1. 
Proor oF THEOREM 3. WithA = [nu] as above, it follows from Theorem 2 that 


H*(u) = >> K(u, j) 
?7=1 


= dp, E> > "ya — 9)" *"G — aaa 


j=A+1 n i=) 


> +) (nu — ¢-1)"‘¢4+ 1). 


t=d 


Interchanging summations in the last term according to the pattern 


~ Lh= > 6-nNL=-LLG-» 


7=A+1 t=) t=r t=A+1 t=\ i=—¢t 
(the second step follows because the index j does not appear in the summand;
the last step follows since at $i = \lambda$ the summand is zero), one obtains


“ n 
H*(u) = > ptn” 7 (") (¢+ 1)" 
=I 


t=) 


, = (i — vA) — nu) abs (n — nu)* "(nu — t — 1). 


w=t 


(14) 


Using known properties of the binomial expansion, one can show that, whenever 
n—t#1 
n—t 


2. f(t — A)(t — nu) + s(2t — A — nu) + 8} 
s=0 


(15) 


(" - ‘ (nu — t — 1)°(n — nu)” = —(t — Ain — ¢t — 1)”. 


When n — ¢ = 1, this sum reduces to 
1 


> {(n — 1 — rA)(n — 1 — nu) + 8(2n — 2 — dX — nu) + 8°} (-1) 
(16) *=0 


= —(n — 1 — A) — n(l — uw). 


Substituting (15) and (16) into (14), while setting 7 — ¢ = s, one obtains 


n—l 


» 
H*(u) = Dp— 0 (") (¢—rat+ Dm -t- yr 
3=1 


td 


(17) 
— (1 — u) — n(n — )(n + 1)™™. 


Employing Theorem 1, Corollary 1 with 7 = n — ¢ — 1, and Lemma 2, one 
concludes from (17) 


n= i “a ~i-k—k 
H*(u) = n” 7 ——$_—__— ‘ : - 1)” = n” dX ——s 


i‘(n — i)” ** — (1 — u) — (n — dA)On 


This completes the proof of Theorem 3. 
A consequence of Theorem 1 is the following
COROLLARY 2. For all integers n > 0, j > 0,

$0 < p_1 < p_2 < \cdots < p_n < 1,$

$\lim_{n \to \infty} n p_j = \sum_{i=1}^{j} \frac{e^{-i} i^{i-1}}{i!},$

$\lim_{n \to \infty} n p_{n-j} = e - \sum_{i=0}^{j-1} \frac{e^{-i} i^{i}}{(i + 1)!}.$

PROOF. The first statement is evident from (10); the second follows from (10)
by applying Stirling's formula; and the third follows by applying Stirling's
formula to the expression

$n p_{n-j} = \frac{(n + 1)^{n-1}}{n^{n-1}} - \frac{1}{n^{n-1}} \sum_{i=0}^{j-1} \frac{1}{i + 1} \binom{n}{i} i^i (n - i)^{n-i-1},$

which can be obtained from (10) by Corollary 1.

Thus the statistic $D_n^+$ places more weight upon the larger observations than
on the smaller ones, in the sense that the maximum deviation between F and
$F_n$ is more likely to occur at $X_{k+1}$ than at $X_k$ for $k = 1, 2, \dots, n - 1$.


4. The asymptotic distribution of $\alpha_n = i^*/n$. Writing $U_n^*$ instead of U*, we
have according to Theorem 3,

(18)  $P[U_n^* \le u] = H_n^*(u) = u.$

Since the Glivenko-Cantelli theorem ([3], p. 260) implies that $D_n^+$ converges in
probability to zero, it follows from (3) that

(19)  $\alpha_n - U_n^* \to 0$ in probability.

From (18) and (19) one can conclude that $\alpha_n$ is asymptotically uniformly
distributed on [0, 1].

The following theorem contains more specific statements on the asymptotic
behavior of the distribution of $\alpha_n$.

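THEOREM 4. For every positive integer n we have

(20)  $E(\alpha_n) = \frac{1}{2} + \frac{1}{2} n^{-n-1} n! \sum_{i=0}^{n-1} \frac{n^i}{i!},$

(21)  $x - \left\{ n^{-n-1} n! \sum_{i=0}^{n-1} \frac{n^i}{i!} \right\}^{1/2} \le \Pr\{\alpha_n < x\} \le x$  for $0 \le x \le 1$.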

PROOF OF THEOREM 4. From Theorem 1 we have

$E(\alpha_n) = n^{-n-1} \sum_{j=1}^{n} j \sum_{i=n-j}^{n-1} \frac{1}{i + 1} \binom{n}{i} i^i (n - i)^{n-i-1}$
$\qquad = \frac{1}{2} + \frac{1}{2} n^{-n-1} \sum_{i=0}^{n-1} \binom{n}{i} i^i (n - i)^{n-i},$

and this by Lemma 1 yields (20). To obtain the upper bound on $\Pr\{\alpha_n < x\}$
in (21) we note that

$G_n(x) = \Pr\{\alpha_n < x\} = \sum_{i=1}^{[nx]} p_i,$

and in view of Corollary 2 this must be $\le x$ for all $1/n \le x \le 1$.
To obtain the lower inequality in (21) we need
LEMMA 4. Let X be a random variable with c.d.f. F, such that F(0) = 0,
F(1 + 0) = 1, $F(x) \le x$ for $0 \le x \le 1$. Then

(22)  $F(s) \ge s - \sqrt{2E(X) - 1}.$

PROOF OF LEMMA 4. We have

$E(X) = \int_0^1 X\,dF(X) = 1 - \int_0^1 F(X)\,dX$
$\qquad \ge 1 - \int_0^{F(s)} X\,dX - \int_{F(s)}^{s} F(s)\,dX - \int_s^1 X\,dX = \tfrac{1}{2}\{1 + [s - F(s)]^2\},$

and this implies (22). One verifies directly that, for given s and F(s), equality
is attained in (22) when F(t) = t for $0 \le t \le F(s)$, F(t) = F(s) for
$F(s) \le t \le s$, F(t) = t for $s \le t \le 1$.
According to the upper inequality in (21), $\Pr\{\alpha_n < x\}$ fulfills the assumptions
of Lemma 4, which together with (20) yields the lower bound of (21).
It may be noted that by an application of Stirling's formula one obtains from
(21)

(23)  $0 \le x - \Pr\{\alpha_n < x\} = O(n^{-1/4}),$

and that (20) together with (3) yields

(24)  $E(D_n^+) = 2^{-1} n^{-n-1} n! \sum_{i=0}^{n-1} \frac{n^i}{i!}.$
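
Formulas (20) and (24) can be cross-checked against the exact probabilities of Theorem 1 for small n. The sketch below recomputes the $p_j$ from the sum in Theorem 1 (as reconstructed from the surrounding proofs), forms $E(\alpha_n) = n^{-1} \sum_j j\,p_j$, and compares it with the closed form; all function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def p(n):
    """[p_1, ..., p_n]: p_j = n^(-n) * sum_{i=n-j}^{n-1} C(n,i) i^i (n-i)^(n-i-1) / (i+1)."""
    terms = [Fraction(comb(n, i) * i ** i * (n - i) ** (n - i - 1), i + 1)
             for i in range(n)]
    return [sum(terms[n - j:]) / n ** n for j in range(1, n + 1)]


def expected_alpha_direct(n):
    """E(alpha_n) = (1/n) * sum_j j * p_j."""
    return sum(j * pj for j, pj in enumerate(p(n), start=1)) / n


def expected_alpha_closed(n):
    """Closed form: 1/2 + (1/2) n^(-n-1) n! sum_{i<n} n^i / i!."""
    series = sum(Fraction(n ** i, factorial(i)) for i in range(n))
    return Fraction(1, 2) + Fraction(factorial(n), 2 * n ** (n + 1)) * series


def expected_d_plus(n):
    """E(D_n^+) = E(alpha_n) - E(U*) = E(alpha_n) - 1/2."""
    return expected_alpha_closed(n) - Fraction(1, 2)


for n in range(1, 8):
    assert expected_alpha_direct(n) == expected_alpha_closed(n)
print(expected_alpha_closed(2), expected_d_plus(1))  # 7/8 and 1/2
```

The last line uses the fact that $D_n^+ = \alpha_n - U^*$ with $U^*$ uniform, so that $E(D_n^+) = E(\alpha_n) - 1/2$.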
ON SEVERAL STATISTICS RELATED TO EMPIRICAL
DISTRIBUTION FUNCTIONS¹


By MEYER DWASS
Northwestern University


1. Introduction. Let $X_1, \dots, X_n$ be n independent random variables, each
with the same continuous c.d.f., F(x). Let $F_n(x)$ be the empirical c.d.f. of the
$X_i$'s. We consider the following random variables,

$U_n = \mu\{F(t): F_n(t) - F(t) > 0\},$
$D_n = \sup_{-\infty < t < \infty} (F_n(t) - F(t)),$
$V_n = \inf_{-\infty < t < \infty} \{F(t): F_n(t) - F(t) = D_n\},$

where {F(t): } denotes the set of values of F(t) for which t satisfies the
condition after the colon. These are sets in the interval (0, 1). In the definition of
$U_n$, $\mu\{\ \}$ means Lebesgue measure. Obviously, there is no loss of generality in
supposing that the $X_i$'s are uniformly distributed over (0, 1) and hence

$U_n = \mu\{t: F_n(t) - t > 0\},$
$D_n = \sup_{0 < t < 1} (F_n(t) - t),$
$V_n = \inf \{t: F_n(t) - t = D_n\}.$

In [5], Kac showed that as $n \to \infty$, $U_n$ has an asymptotic distribution which
is uniform over (0, 1). A stronger result was recently obtained by Gnedenko and
Mihalevič in [4] in which they showed that for every n, $U_n$ is uniformly
distributed. Birnbaum and Pyke in a forthcoming paper [2] show that for every n,
$V_n$ is also distributed uniformly over (0, 1). The methods of [2] and [4] are
computational and the purpose of this note is to derive the uniform distribution of
$U_n$ and $V_n$ by a short method which employs results of E. S. Andersen and a
well-known relationship between the Poisson process and uniformly distributed
random variables. In Sec. 3, a generalization of these results is given.


2. Proof of uniform distribution of $U_n$ and $V_n$. The proof depends on two sets
of facts. The first refers to the Poisson process. By this we mean the stochastic
process, X(t), with independent and stationary Poisson distributed increments,
defined for $t \ge 0$ and such that X(0) = 0. For this process, it is well known that
given that X(1) = n, a positive integer, then the conditional distribution of the
discontinuity (jump) points, $t_1 \le t_2 \le \cdots \le t_n$ of X(t), $0 \le t \le 1$, is that
of the ordered values of n independent, uniform random variables. Another way of
saying this, somewhat roughly, is that the conditional distribution of the random
function X(t), $0 \le t \le 1$, given that X(1) = n, is that of the empirical c.d.f. of n
independent, uniform random variables. For a proof of these facts see p. 400 of
[3]. The second set of needed facts is contained in a paper of E. S. Andersen [1],
namely:

LEMMA (ANDERSEN). Let $Y_1, Y_2, \dots$ be independent and identically distributed
random variables. Make the definitions

$S_0 = 0$ (a.s.),  $S_i = \sum_{j=1}^{i} Y_j,$
$L_r$ = smallest i for which $S_i = \max (0, S_1, \dots, S_r)$,
$N_r$ = number of positive terms in $S_1, \dots, S_r$.

Then

$P(L_r = m \mid S_{r+1} = 0) = P(N_r = m \mid S_{r+1} = 0) = \frac{1}{r + 1}$

for $m = 0, 1, \dots, r$, if and only if

(3)  $P(S_i = S_{r+1} = 0) = 0, \qquad (i = 1, \dots, r).$


We remark that Andersen’s results are much more general, but we state them 
in a form convenient for our applications. 

THEOREM 1. $U_n$ and $V_n$ are each distributed uniformly over (0, 1).

PROOF. Consider the Poisson process X(t), $0 \le t \le 1$. Divide the
interval (0, 1) into the r + 1 parts (0, 1/(r + 1)), (1/(r + 1), 2/(r + 1)), ...,
(r/(r + 1), 1), where r + 1 is greater than n and is a prime number. (Whenever
we state $r \to \infty$ we will understand that r + 1 goes through the primes.) The
increments of X(t) in these intervals are independent and identically distributed
Poisson random variables. We denote these increments by $W_1, W_2, \dots, W_{r+1}$,
respectively, and define $Y_i = W_i - n/(r + 1)$, $i = 1, \dots, r + 1$. The $Y_i$'s are
independent and identically distributed. We want to show that they satisfy (3) of
Andersen's lemma. This is so because $S_i = S_{r+1} = 0$ implies that
$(r + 1) X(i/(r + 1)) = ni$. This cannot hold since by the primeness of r + 1, n must be
a factor of X(i/(r + 1)), but since X(t) is non-decreasing this would mean
X(i/(r + 1)) = n, or r + 1 = i, a contradiction; thus (3) holds. Under the
condition X(1) = n, X(t) is distributed like $F_n(t)$, for $0 \le t \le 1$. Hence we can
define $U_n$, $V_n$ for X(t), $0 \le t \le 1$. We next observe that when X(1) = n, then

(4)  $\left| U_n - \frac{N_r}{r + 1} \right| \le \frac{A}{r + 1}, \qquad \left| V_n - \frac{L_r}{r + 1} \right| \le \frac{B}{r + 1},$

where A, B are constants which depend on n but not on r. Thus, under the
condition X(1) = n, both absolute values in (4) converge in probability to zero as
$r \to \infty$. Since $N_r/(r + 1)$ and $L_r/(r + 1)$ are asymptotically uniformly
distributed over (0, 1) as $r \to \infty$, this completes the proof.
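
For a concrete sample the three statistics are easy to evaluate, because on each interval between consecutive order statistics $F_n(t)$ is constant and $F_n(t) - t$ decreases linearly. The sketch below assumes a sample of distinct points in (0, 1); the function name and the sample are illustrative and not part of the paper.

```python
def dwass_statistics(sample):
    """Return (U_n, D_n, V_n) for a sample of distinct points in (0, 1).

    U_n: Lebesgue measure of {t : F_n(t) - t > 0}
    D_n: sup of F_n(t) - t over 0 < t < 1
    V_n: smallest t at which F_n(t) - t equals D_n
    """
    t = sorted(sample)
    n = len(t)
    right_ends = t[1:] + [1.0]
    u_n = 0.0
    for i in range(1, n + 1):
        left, right = t[i - 1], right_ends[i - 1]
        # On [left, right) we have F_n = i/n, so F_n(t) - t > 0 iff t < i/n.
        u_n += max(0.0, min(right, i / n) - left)
    # The supremum is attained at a jump point; max() keeps the first one.
    i_best = max(range(1, n + 1), key=lambda i: i / n - t[i - 1])
    return u_n, i_best / n - t[i_best - 1], t[i_best - 1]


u_n, d_n, v_n = dwass_statistics([0.5, 0.9, 0.1])
print(u_n, d_n, v_n)
```

For this sample the empirical c.d.f. lies above the diagonal on intervals of total length 0.5, and the largest excess occurs at the smallest observation.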
3. Generalization. A generalization of Theorem 1 can be given which may be
of interest. Let

$X_{11}, \dots, X_{1n_1};\ \dots;\ X_{k1}, \dots, X_{kn_k}$

be $n = n_1 + \cdots + n_k$ independent random variables each uniformly distributed
over (0, 1). Let $F^{(1)}(t), \dots, F^{(k)}(t)$ be the empirical c.d.f.'s of each of the k sets of
variables and define

$F_p(t) = p_1 F^{(1)}(t) + \cdots + p_k F^{(k)}(t), \qquad 0 < t < 1,$

where $p = (p_1, p_2, \dots, p_k)$, $p_i > 0$, $p_1 + p_2 + \cdots + p_k = 1$. In the special case
where $p_i = n_i/n$, $i = 1, \dots, k$, then $F_p(t)$ is the empirical c.d.f. of the combined
set of n variables. Otherwise $F_p(t)$ can only be described as a nondecreasing
random step function on (0, 1) such that $F_p(0) = 0$, $F_p(1) = 1$. Nevertheless,
random variables $U_p$, $D_p$ and $V_p$ analogous to $U_n$, $D_n$ and $V_n$ may be defined for
$F_p(t)$ exactly as was done in (1) for $F_n(t)$; (replace $F_n(t)$ by $F_p(t)$ in (1)). In the
following theorem we understand them to be so defined.

THEOREM 2. $U_p$ and $V_p$ are each distributed uniformly over (0, 1).

PROOF. Let $X_1(t), X_2(t), \dots, X_k(t)$ be k independent Poisson processes and
define $X(t) = p_1 X_1(t) + \cdots + p_k X_k(t)$. Then X(t) is also a process with
stationary independent increments. Define now, for $p = (p_1, p_2, \dots, p_k)$,

$U_p = \mu\{t: X(t) - X(1)t > 0,\ 0 \le t \le 1\},$
$D_p = \sup_{0 \le t \le 1} (X(t) - X(1)t),$
$V_p = \inf \{t: X(t) - X(1)t = D_p\}.$

We suppose first that

(5)  $p_1 = a_1/a, \dots, p_k = a_k/a,$

where $a_1, \dots, a_k$ are positive integers, and $a_1 + \cdots + a_k = a$. If b is a number
such that P(X(1) = b) > 0, then $U_p$, $V_p$ are each uniformly distributed over
(0, 1) given that X(1) = b. The proof of this fact follows exactly the proof of
Theorem 1. In particular the definition of the $p_i$'s by (5) allows a verification of
the condition (3) of Andersen's lemma which is exactly analogous to that done in
the proof of Theorem 1. Since the $p_i$'s as defined by (5) are dense in the set of all
possible $p_i$'s, it follows by a simple continuity argument that the conditional
distribution of $U_p$, $V_p$ given that X(1) = b, is uniform without the restriction
(5). If $X(1) = p_1 X_1(1) + \cdots + p_k X_k(1) = b$, this need not uniquely determine
the values of the $X_i(1)$. That is, there may be two different sets of positive or
zero integers, $x_1, \dots, x_k$; $y_1, \dots, y_k$, such that

$p_1 x_1 + \cdots + p_k x_k = p_1 y_1 + \cdots + p_k y_k = b.$

On the other hand, there is a dense subset of the k-dimensional unit cube where
this cannot happen, namely any dense subset, each point of which has rationally
independent coordinates. Thus, in such a dense subset $X(1) = p_1 n_1 + \cdots + p_k n_k$
if and only if $X_1(1) = n_1, \dots, X_k(1) = n_k$, for a set of $p_i$'s which are dense
in the set of all possible $p_i$'s. For such $p_i$'s the conditional distribution of $U_p$ and
$V_p$ given that $X_1(1) = n_1, \dots, X_k(1) = n_k$, is thus uniform. This holds also for
the exceptional $p_i$'s by a continuity argument. This completes the proof since
$F^{(1)}(t), \dots, F^{(k)}(t)$ are distributed like $X_1(t), \dots, X_k(t)$ for $0 \le t \le 1$, under the
conditions that $X_1(1) = n_1, \dots, X_k(1) = n_k$.

4. Concluding remarks. The linear combinations of Theorem 2 are convex
($p_1 + \cdots + p_k = 1$) and positive ($p_i > 0$). The convexity, as well as the strict
positivity, is a matter of convenience. The condition of non-negativeness,
however, cannot be removed. It is easy to verify directly, for example, that the
theorem does not hold for

$F_p(t) = p_1 F^{(1)}(t) + p_2 F^{(2)}(t),$

if $p_1 > 0$ and $p_2 < 0$. The trouble arises because the condition (3) of Andersen's
lemma fails to hold.
ON THE EFFICIENCY OF ESTIMATES OF TREND IN THE ORNSTEIN
UHLENBECK PROCESS


By CHARLOTTE T. STRIEBEL
University of California, Berkeley


1. Summary. The problem is that of estimating the trend of a normal process
when the trend function is known up to a finite number of coefficients. That is,

$y_t = x_t + f(t), \qquad 0 \le t \le T,$

where $x_t$ is a normal process with mean zero and covariance function

$E[x_u x_v] = C(u, v)$

and

$f(t) = k_1 \phi_1(t) + \cdots + k_p \phi_p(t).$

The $\phi_i(t)$ are known functions and the $k_i$ are to be estimated.

The standard procedure in such a case is to derive the estimates by the
maximum likelihood method. However, if the covariance function C(u, v) is not
completely known, this is usually impossible, and it is essential to find an
alternative procedure. The method of least squares has been proposed by Mann
[1]. The estimates obtained by this method are independent of C(u, v) and have
the additional advantage of being easily computed. Mann and Moranda [2]
showed that for the Ornstein Uhlenbeck process the asymptotic efficiency of the
least square estimate relative to the maximum likelihood estimate is one, in
the special case that the $\phi_i(t)$ are polynomials or trigonometric polynomials.
Mann defines the efficiency e(T) of an estimate $f^*(t)$ as

$e(T) = \frac{E\left[\int_0^T (\hat{f}(t) - f(t))^2\,dt\right]}{E\left[\int_0^T (f^*(t) - f(t))^2\,dt\right]},$

where $\hat{f}(t)$ is the maximum likelihood estimate. For the cases that shall be of
particular interest (the Ornstein Uhlenbeck process with $f^*(t)$ a linear unbiased
estimate) Mann and Moranda [2] have shown that $e(T) \le 1$.

In the present paper the asymptotic efficiency of the least square estimates
will be computed for a wider class of functions $\phi_i(t)$. It will be shown that except
for a special case just slightly broader than the one treated by Mann and
Moranda, the asymptotic efficiency is actually less than one. Thus except for this
special case, the least square estimates could be improved upon. An alternative
estimate $k_i(\alpha)$ is proposed. It will be shown that for $\alpha \ge \beta$, where $\beta$ is the true
correlation parameter in the Ornstein Uhlenbeck process, the estimates $k_i(\alpha)$ are
asymptotically more efficient than the least square estimates, and in fact as
$\alpha \to \beta$ from above the efficiency increases (strictly) to one.


2. Introduction. The least square estimate is obtained by minimizing the
expression

$\int_0^T (y_t - f(t))^2\,dt$

and is given by

$k_i^* = \sum_{j=1}^{p} [G^{-1}(T)]_{ij} \int_0^T \phi_j(t) y_t\,dt,$

where

(1)  $G_{ij}(T) = \int_0^T \phi_i(t) \phi_j(t)\,dt.$

The maximum likelihood estimates $\hat{k}_i$ minimize

$\int_0^T \int_0^T [y_u - f(u)][y_v - f(v)] C^{-1}(u, v)\,du\,dv$

and are given by

$\hat{k}_i = \sum_{j=1}^{p} [\Phi^{-1}(T)]_{ij} \int_0^T \int_0^T \phi_j(u) y_v C^{-1}(u, v)\,du\,dv,$

(2)  $\Phi_{ij}(T) = \int_0^T \int_0^T \phi_i(u) \phi_j(v) C^{-1}(u, v)\,du\,dv.$

It will be assumed that the $\phi_i(t)$ and C(u, v) are such that these integrals exist.
The efficiency of the least square estimates can now be computed.

$e(T) = \frac{\operatorname{tr}(G(T)\Phi^{-1}(T))}{\operatorname{tr}(\Psi(T)G^{-1}(T))},$

where

(3)  $\Psi_{ij}(T) = \int_0^T \int_0^T \phi_i(u) \phi_j(v) C(u, v)\,du\,dv.$

The trace of a matrix is denoted by tr.
It will further be assumed that there are functions $H_i(T)$ such that the limits

(4)  $G_{ij} = \lim_{T \to \infty} \frac{G_{ij}(T)}{H_i(T)H_j(T)}, \quad \Phi_{ij} = \lim_{T \to \infty} \frac{\Phi_{ij}(T)}{H_i(T)H_j(T)}, \quad \Psi_{ij} = \lim_{T \to \infty} \frac{\Psi_{ij}(T)}{H_i(T)H_j(T)}$

exist and are positive definite matrices. The asymptotic efficiency then is

(5)  $e = \lim_{T \to \infty} e(T) = \frac{\operatorname{tr}(G\Phi^{-1})}{\operatorname{tr}(\Psi G^{-1})}.$

Necessary and sufficient conditions that e = 1 will be found for two classes of
G, $\Phi$, $\Psi$. The first, which includes the cases treated by Mann and Moranda,
requires that G, $\Phi$, $\Psi$ be of the form

(6)  $G = \sum_{n=1}^{N} G_n, \qquad \Phi = \sum_{n=1}^{N} c_n G_n, \qquad \Psi = \sum_{n=1}^{N} \frac{1}{c_n} G_n,$

where the $G_n$ are positive semi-definite matrices and the $c_n$ are distinct positive
real numbers. The second requires that


& BGB"* 
v = B'GB"'+0C, 


(7) 


where B is positive definite, and C is positive semi-definite.
Results will be applied to the case that $x_t$ is an Ornstein Uhlenbeck process and
the $\phi_i(t)$ are of the form first


Yi 


N 
ot) = > t (dine SIN wt + Dine COS w, b) 


n=l r=] 


and second 


o(t) = &*, a; > 0. 


When the covariance C(u, v) involves some unknown parameters an attempt
can be made to estimate them along with the $k_i$ by the maximum likelihood
method. However, this frequently leads to equations which cannot be solved.
In this case, a natural procedure is to make an estimate C*(u, v) of C(u, v) by any
convenient method and then use the maximum likelihood estimates of the $k_i$
based on the covariance C*(u, v).

For the Ornstein Uhlenbeck process

$C(u, v) = \sigma^2 e^{-\beta|u - v|}.$

$C^*(u, v) = \sigma^2$ if $u = v$, and $0$ otherwise.

This covariance function yields the least square estimates. If the true value $\beta$ is
replaced by $\alpha$, a family of estimates is obtained by this method.


(8) 
a T) 5 | esr Yr + ¢;(0) Yo+ - rT $; “(r) dy. +a [ oj(Uy. at], 


j=l 0 
where 


$,,;( T) 


= | er p(T) + {0)¢,(0) + . [ o:(t);(t) dt + [ $1(t);(t) av]. 


Clearly 


lim F,(a) K;, 


k(8) = k,. 
3. Efficiency of estimates for G, $\Phi$, $\Psi$ of form (6). Assume that G, $\Phi$, $\Psi$
defined by (1), (2), (3), and (4) are of the special form (6). Then from (5)
_ (4G — = _ Ge" (ev — GG) 
t(vwG iva 





(oy — GG) + (¢¥ — GG)’ 


= > noe a. ~-¢a.+  G.Gn - G.G.| 


n=l m=1 


=> e & = = 6) 6G, 


n=1 m=1 Cu Cm 


GG, are positive semi-definite and 
all m #n. 


Thus, $\Phi\Psi - GG$ is positive semi-definite. In order that 1 - e = 0, it is necessary
and sufficient that

$\Phi\Psi - GG = 0.$

This is equivalent to requiring

$\operatorname{tr}(G_n G_m) = 0$, all $m \ne n$.

This result will be stated as a theorem.
THEOREM 1. If G, $\Phi$, $\Psi$ are nonsingular and of the form

$G = \sum_{n=1}^{N} G_n, \qquad \Phi = \sum_{n=1}^{N} c_n G_n, \qquad \Psi = \sum_{n=1}^{N} \frac{1}{c_n} G_n,$

where the $G_n$ are positive semi-definite matrices and the $c_n$ are distinct positive real
numbers, then

$e = \frac{\operatorname{tr}(G\Phi^{-1})}{\operatorname{tr}(\Psi G^{-1})} = 1$

if and only if

$\operatorname{tr}(G_n G_m) = 0$, all $m \ne n$.

For the Ornstein Uhlenbeck process the theorem can be applied to obtain the
special result.
THEOREM 2. Let

$y_t = x_t + f(t),$

where $x_t$ is an Ornstein Uhlenbeck process with mean zero, and

$f(t) = k_1 \phi_1(t) + \cdots + k_p \phi_p(t).$

Suppose


\ Yi 
b(t) = Do Do t(aine Sin wn t + Dine COS wy 


n=l r=1 


are such that 


N 


* 7 4\ 1 : ' 
@;(t) = t”* Zz. (Giny; SIN wy t + Dy; COS wy t 


n=1 


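are linearly independent. Then the asymptotic efficiency of the least square estimates
of the $k_i$ is one, if and only if

$$(11) \qquad \sum_i a_{in\gamma}a_{im\gamma} = \sum_i a_{in\gamma}b_{im\gamma} = \sum_i b_{in\gamma}b_{im\gamma} = 0$$

for all $\gamma$ and $m \ne n$. The sums extend over all $i$ for which $\gamma_i = \gamma$.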
Proor. Let H;(7’) = 77**"”*. The only terms which appear in the limits (4) 
will be those of maximum order, that is, those of the $\phi_i^*(t)$. Denote $a_{in\gamma_i}$ by $a_{in}$
and $b_{in\gamma_i}$ by $b_{in}$. Then $G$, $\Phi$, $\Psi$ can be computed and are of the form (6) with


Gin Ayn + Din Din 
CVs Ve TT Bet ¥ 
8 + ws 
—- 


The $G_n$ can easily be shown to be positive semi-definite. Thus by theorem 1,


$\bar{e} = 1$ if and only if


Gaij _ 


Cn 


“uGG,) = 0, allm # n. 


1(G,.G : > > Ain Aim Ajn Ajm Ain h on din Vin + im Din Aim Din - Din Dim Din Dim 
ass a ms (ys + vi + 1)? vi 1)? 5 2)? 


(y +6 + 1)2(7!)2(6!)? — 


aie : rR A mny 4imns “TT Cone — 7. Cui Cail + = Bans 
Y é 


+ and 6 are summed over all distinct values of y; and 


A many — > AinQim , 


Dd Dindim 5 
e 


mny > AinDim . 


The summations extend over all values of $i$ for which $\gamma_i = \gamma$. Since

$$\frac{1}{(\gamma + \delta + 1)^2}$$

is a positive definite matrix for $\gamma$, $\delta$ ranging over distinct integers,

$$\operatorname{tr}(G_n G_m) = 0, \qquad \text{all } m \ne n,$$

if and only if

$$A_{mn\gamma} = B_{mn\gamma} = C_{mn\gamma} = 0$$

for all $\gamma$ and $m \ne n$.
Thus, unless the special conditions (11) are satisfied, $\bar{e}$ will be strictly less than
one. For example,

$$f(t) = k_1 + k_2 \sin t + k_3 \sin 2t$$

can be estimated efficiently by least squares, but

$$f(t) = k_1 + k_2(\sin t + \sin 2t)$$

cannot.


Grenander and Rosenblatt [3] in Section 7.6 obtain results very similar to those
of Theorem 2.
THEOREM 3. If $y_t$ and $\phi_i(t)$ are as in the hypothesis of theorem 2, then the asymptotic
efficiency $\bar{e}(\alpha)$ of the estimate $k_i(\alpha)$ (8) is monotone decreasing from 1 at $\alpha = \beta$
to $\bar{e}$ as $\alpha \to \infty$. If $\bar{e} \ne 1$, then it is strictly decreasing.

PROOF. First the efficiency of the $k_i(\alpha)$ estimates must be computed.

-T 


E(T,a) = | | (fa(t) — f(t)? in| 
-0 


= t[Zi,ca k; a) (T)G;;(T)}. 
= E|(ki(a) — ki) (ka) — k;)] 


5 : 
kj (a)kj(a) 


ot) | 5 wer +8 ar) | ain), 
4 a 


Pe A UA 
Waij(T) (d(u) + = di(u))(O;(v) + = o;(v))e du dv, 
Jo Jo a a 
and ®,(7) is defined by (9). 
For ¢;(t) as in theorem 2 and 
BAT) «= T°". 
$,;;(T) 


= tim PD tat + Fa 


ree A (T)HAT) 
sat l (altsry 7 
b= 506+), 


? 
| i(t)p;(t) dt 
fo AAT) 
Vais(T) _ (a — 6) 


23 
sim _-o. = = Gi 
ree HT)H,(1) 7 ro 


Wass 
1 , 9 , 
b, = = [(a’ — B)G + 284). 

A = (a — B)G + 296. 


lim E(T, a) 


T-2 


t{A*A[(a — BY + 4B8(a? — B’)G + 46'6)}, 
OE(a) » 2 re ; 
—— = 8aB(a’ — &)HA “AGA &G' wv — &°G)). 
0a 
It follows from (10) that 
G'¥ —#°G 
is positive semi-definite. Thus, for a > 8, the derivative of F(a) is nonnegative 
and E(a) is monotone increasing. If é ¥ 1, then 


UG '¥ —@'G) > 0, 


and at least one characteristic root must be nonzero. Since A~’A™'GA~'® is 
positive definite dE (a@)/da will then be positive, and E(a@) is strictly increasing. 


elf 


(f(t) — f(y at| 


lim -— 


(fa(t) — f(t))° at| 


et E lf 


t(@G) 


~ E(a) 
4. Efficiency of estimates for exponential ¢,(¢). Assume G, #, ¥ are of the 
form (7). Then 


(CG) 


_ t(w¥G" — &'G) _ 
t((wG-)’ 


aoe “UvG) 
and é€ = 1 if and only if C = 0. 
Let 
¢d,(t) 
and 
H (a) 


where the a; are positive and distinct. Then 


1 
a; + a;’ 
(8 + a,)(8 + a,) 
a; + a; ; 
28 1 


Gi 


$;; = 


on sanmmnge en -+- 
(8 + a,)(B + a;)(a,; + a,) 


_ (B+ a,b; 


B;; 


(8 + a,)(B + a;)° 
and hence 
| © 


Since the least square estimates are not asymptotically efficient, it is of interest
to compute the efficiency of the $k_i(\alpha)$ estimates. In this case
(a+ a)(a+ aj) _ 


= 2 AGA, 


®,,; = pee oe 
2a(a; + a;) 2a 


where 


(a + a; )6;; 


| Awa 


a 


ttG"A'GA'G"[(e — &)¥ + 28G}} 


VIS (a) ee 
—— = 2%a — pA “GDI, 

Ja 
where 


i an B)a, + (a “+ B)a; + 2a, a 


(a; + a;)(a + a,)2(B + a;))(a + a;) 


' 
For $\alpha > \beta$ this matrix is positive definite. Thus, $\partial E(\alpha)/\partial\alpha$ is positive, and $E(\alpha)$
is strictly increasing. Thus, for $\alpha > \beta$, the $k_i(\alpha)$ estimates are more efficient than
the least square estimates.
UNBIASED ESTIMATION OF CERTAIN CORRELATION COEFFICIENTS¹


By INGRAM OLKIN² AND JOHN W. PRATT
University of Chicago


1. Summary and introduction. This paper deals with the unbiased estimation
of the correlation of two variates having a bivariate normal distribution (Sec. 2),
and of the intraclass correlation, i.e., the common correlation coefficient of a
p-variate normal distribution with equal variances and equal covariances
(Sec. 3).

In both cases, the estimator has the following properties. It is a function of a
complete sufficient statistic and is therefore the unique (except for sets of
probability zero) minimum variance unbiased estimator. Its range is the region of
possible values of the estimated quantity. It is a strictly increasing function of
the usual estimator differing from it only by terms of order 1/n and consequently
having the same asymptotic distribution.

Since the unbiased estimators are cumbersome in form in that they are
expressed as series or integrals, tables are included giving the unbiased estimators
as functions of the usual estimators.

In Sec. 4 we give an unbiased estimator of the squared multiple correlation.
It has the properties mentioned in the second paragraph except that it may be
negative, which the squared multiple correlation cannot.

In each case the estimator is obtained by inverting a Laplace transform. 

We are grateful to W. H. Kruskal and L. J. Savage for very helpful comments 
and suggestions, and to R. R. Blough for his able computations. 


2. Correlation coefficient. Let $(x_1, y_1), \cdots, (x_N, y_N)$ be independently
distributed, each bivariate normal with means $\mu_1$, $\mu_2$, variances $\sigma_1^2$, $\sigma_2^2$ and
correlation $\rho$. The problem is to estimate $\rho$ unbiasedly in the cases (i) $\mu_1$, $\mu_2$ known,
$\sigma_1$, $\sigma_2$, $\rho$ unknown, and (ii) all parameters unknown.

Sufficiency and invariance suggest that we confine ourselves to odd functions of
$r$, where $r$ is the usual sample correlation coefficient in either case, namely,

$$r = \frac{\sum (x_i - \hat{\mu}_1)(y_i - \hat{\mu}_2)}{\sqrt{\sum (x_i - \hat{\mu}_1)^2 \sum (y_i - \hat{\mu}_2)^2}},$$

where $(\hat{\mu}_1, \hat{\mu}_2)$ equals $(\mu_1, \mu_2)$ in (i) and $(\bar{x}, \bar{y})$ in (ii).
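2.1. Derivation of the unbiased estimator. The density of $r$ is

$$(2.1) \qquad \frac{2^{n-2}}{\pi\,\Gamma(n-1)}\,(1-\rho^2)^{n/2}(1-r^2)^{(n-3)/2}\sum_{k=0}^{\infty}\Gamma^2\!\left(\frac{n+k}{2}\right)\frac{(2\rho r)^k}{k!},$$

where the degrees of freedom are $n = N$ and $N - 1$ in cases (i) and (ii). (We
assume $n \ge 2$, the case $n = 1$ being degenerate.) The condition $E[G(r)] = \rho$,
i.e., $G(r)$ is unbiased, is equivalent to

$$\frac{2^{n-2}}{\pi\,\Gamma(n-1)}\sum_{k=0}^{\infty}\Gamma^2\!\left(\frac{n+k}{2}\right)\frac{(2\rho)^k}{k!}\int_{-1}^{1}G(r)\,r^k(1-r^2)^{(n-3)/2}\,dr = (1-\rho^2)^{-n/2}\rho.$$

Comparing coefficients of powers of $\rho$, we find that $G(r)$ is indeed an odd function,
and that, for $j = 0, 1, \cdots$,

$$\int_{0}^{1}G(r)\,r^{2j+1}(1-r^2)^{(n-3)/2}\,dr = \frac{\pi\,\Gamma(n-1)\,\Gamma\!\left(\frac{n}{2}+j\right)\Gamma(2j+2)}{2^{n+2j}\,\Gamma\!\left(\frac{n}{2}\right)\Gamma(j+1)\,\Gamma^2\!\left(\frac{n+2j+1}{2}\right)}.$$

Using the identity (e.g., [3, 12.4.4])

$$\sqrt{\pi}\,\Gamma(2p) = 2^{2p-1}\,\Gamma(p)\,\Gamma(p+\tfrac12)$$

and making the substitution $r = \exp(-\tfrac12 y)$, we obtain

$$\int_{0}^{\infty}G(e^{-y/2})(1-e^{-y})^{(n-3)/2}e^{-y}e^{-jy}\,dy = \frac{\Gamma\!\left(\frac{n-1}{2}\right)\Gamma\!\left(j+\frac32\right)\Gamma\!\left(\frac{n}{2}+j\right)}{\Gamma^2\!\left(\frac{n+1}{2}+j\right)}.$$

As a function of $j$, for $n \ge 2$, the right-hand side is the unilateral Laplace
transform of

$$e^{-\frac32 y}(1-e^{-y})^{((n-1)/2)-1}\,F\!\left(\tfrac12, \tfrac12; \tfrac{n-1}{2}; 1-e^{-y}\right)$$

[1, p. 262 (7)], where $F$ is the hypergeometric function

$$(2.2) \qquad F(\alpha, \beta; \gamma; x) = \sum_{k=0}^{\infty}\frac{\Gamma(\alpha+k)\,\Gamma(\beta+k)\,\Gamma(\gamma)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\gamma+k)}\,\frac{x^k}{k!}.$$

Therefore

$$(2.3) \qquad G(r) = r\,F\!\left(\tfrac12, \tfrac12; \tfrac{n-1}{2}; 1-r^2\right)$$

$$(2.4) \qquad\quad = r\,\frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\Gamma\!\left(\frac12\right)\Gamma\!\left(\frac{n-2}{2}\right)}\int_{0}^{1}\frac{t^{-1/2}(1-t)^{(n-4)/2}}{[1-t(1-r^2)]^{1/2}}\,dt$$

$$(2.5) \qquad\quad = \frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\Gamma\!\left(\frac12\right)\Gamma\!\left(\frac{n-2}{2}\right)}\int_{0}^{\infty}\frac{t^{-1/2}}{(1+t)^{(n-2)/2}}\,\frac{r}{(1+tr^2)^{1/2}}\,dt$$

[2, 2.12 (1) and (5)].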
2.2 Properties of the unbiased estimator. $G(r)$ is an odd function of $r$ by (2.3),
and is strictly increasing since, in (2.5), $r(1 + tr^2)^{-1/2}$ is strictly increasing in $r$
for each value of $t$, $0 < t < \infty$. For $\rho = \pm 1$, $G(r) = r = \pm 1$ with probability 1,
and consequently, $-1 \le G(r) \le 1$, which is the range of $\rho$.

As remarked before, $G(r)$ is the unique minimum variance unbiased estimator
of $\rho$.

To obtain the asymptotic distribution of $G(r)$, we note that, by (2.2),

$$F(\alpha, \beta; \gamma; x) = 1 + x\,O(1/\gamma)$$

as $\gamma \to \infty$ (uniformly in $x$ for $x$ in any bounded set), so that $G(r) = r + O_p(1/n)$.
Therefore $\sqrt{n}[G(r) - \rho]$ has the same asymptotic distribution as $\sqrt{n}[r - \rho]$,
which is $N(0, (1 - \rho^2)^2)$, [3, p. 366].

In order to facilitate the use of the unbiased estimator $G(r)$, Table 1 gives
$G(r)$ and (for easier interpolation) $G(r)/r$ for $r = 0(.1)1$ and $n = 2(2)30$. The
computation was carried out by means of the recursive relation

$$zF(\tfrac12, \tfrac12; \gamma+1; z) = \frac{\gamma(\gamma-1)}{(\gamma-\frac12)^2}\left[(2z-1)F(\tfrac12, \tfrac12; \gamma; z) + (1-z)F(\tfrac12, \tfrac12; \gamma-1; z)\right],$$

[2, 2.8 (30)], together with the initial conditions

$$F(\tfrac12, \tfrac12; \tfrac12; z) = 1/\sqrt{1-z},$$

$$F(\tfrac12, \tfrac12; \tfrac32; z) = \arcsin\sqrt{z}\,/\sqrt{z},$$

[2, 2.8 (4) and (13)].
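
The recursion above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `hyp_series` and `g_by_recursion` are hypothetical, the recursion is run forward in $\gamma$ from the two closed-form starting values, and only even $n$ (half-integer $\gamma = (n-1)/2$, as in Table 1) is handled. Forward recursion of this kind can lose accuracy when $z = 1 - r^2$ is small and many steps are taken, so the direct series is included as a cross-check.

```python
import math

def hyp_series(a, b, c, x, tol=1e-15, max_terms=100000):
    """Sum the Gauss series for F(a, b; c; x), assuming 0 <= x < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # ratio of consecutive terms of the hypergeometric series
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def g_by_recursion(r, n):
    """G(r) = r F(1/2, 1/2; (n-1)/2; 1 - r^2) for even n >= 2, 0 < |r| < 1."""
    z = 1.0 - r * r
    f_prev = 1.0 / math.sqrt(1.0 - z)                # gamma = 1/2
    if n == 2:
        return r * f_prev
    f_curr = math.asin(math.sqrt(z)) / math.sqrt(z)  # gamma = 3/2
    gamma = 1.5
    while 2 * gamma + 1 < n:
        # z F(gamma+1) = gamma (gamma-1) / (gamma-1/2)^2
        #                * [(2z-1) F(gamma) + (1-z) F(gamma-1)]
        f_next = (gamma * (gamma - 1.0) / (gamma - 0.5) ** 2
                  * ((2.0 * z - 1.0) * f_curr + (1.0 - z) * f_prev) / z)
        f_prev, f_curr = f_curr, f_next
        gamma += 1.0
    return r * f_curr

if __name__ == "__main__":
    r = 0.5
    for n in (4, 6, 8, 10):
        direct = r * hyp_series(0.5, 0.5, (n - 1) / 2.0, 1.0 - r * r)
        print(n, g_by_recursion(r, n), direct)
```

For moderate $r$ and the range of $n$ shown, the two columns should agree to many decimal places; the comparison is a sanity check, not a statement about how the original table was computed.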
Approximations for $G(r)$ can be obtained from the expansion (2.2), which gives

$$(2.6) \qquad \frac{G(r)}{r} = 1 + \frac{1-r^2}{2(n-1)} + \frac{9(1-r^2)^2}{8(n^2-1)} + O(n^{-3}).$$

(2.6) gives $G(r)/r$ within .01 for $n \ge 14$ or .001 for $n \ge 36$ if two terms are
included, and within .01 for $n \ge 10$ or .001 for $n \ge 18$ if three terms are included.
The neglected terms of the series in (2.6) are all positive and decreasing in $r^2$
and $n$. Therefore, if $G(r)$ is estimated by cutting off this series, the estimate will
be too small, by a percentage which decreases as $r^2$ and $n$ increase.

The $k$ that minimizes the maximum over $r$ of the absolute difference between
(2.6) and $1 + (1 - r^2)/2(n - k)$ is, for large $n$, $(-7 + 9\sqrt{2})/2 = 2.87$.
This suggests the approximation

$$(2.7) \qquad \frac{G(r)}{r} \approx 1 + \frac{1-r^2}{2(n-3)}.$$

This is accurate within .01 for $n \ge 8$, and within .001 for $n \ge 18$.

TABLE 1
Ordinary bivariate correlation coefficient, n degrees of freedom
(Table values are not recoverable from the source.)

By (2.2) and (2.3), $G(r)/r$ is larger than 1 and decreasing in $r^2$ and $n$, as Table
1b suggests.
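
As a rough numerical companion to (2.3) and (2.7), the following sketch evaluates $G(r)$ by summing the hypergeometric series (2.2) directly and compares it with the approximation (2.7). The function names (`hyp2f1_series`, `olkin_pratt_g`, `g_approx`) and the truncation tolerance are illustrative assumptions, not part of the paper, and the series is only intended for $0 < |r| \le 1$.

```python
import math

def hyp2f1_series(a, b, c, x, tol=1e-15, max_terms=100000):
    """Sum the Gauss series (2.2) for F(a, b; c; x), assuming 0 <= x < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # ratio of consecutive terms of the hypergeometric series
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def olkin_pratt_g(r, n):
    """G(r) = r F(1/2, 1/2; (n-1)/2; 1 - r^2), n degrees of freedom."""
    return r * hyp2f1_series(0.5, 0.5, (n - 1) / 2.0, 1.0 - r * r)

def g_approx(r, n):
    """Approximation (2.7): G(r) ~ r [1 + (1 - r^2) / (2 (n - 3))]."""
    return r * (1.0 + (1.0 - r * r) / (2.0 * (n - 3)))

if __name__ == "__main__":
    r = 0.5
    for n in (10, 20, 30):
        # the ratio G(r)/r should exceed 1 and shrink as n grows
        print(n, olkin_pratt_g(r, n) / r, g_approx(r, n) / r)
```

With $n$ degrees of freedom taken as $N - 1$ for a sample of size $N$ with unknown means, `olkin_pratt_g(r, N - 1)` corresponds to case (ii) of Sec. 2.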

2.3 Partial correlation coefficient. We observe that an unbiased estimator of the
partial correlation coefficient can be immediately obtained from the preceding
section. More precisely, suppose the columns of $X$: $p \times N$ are independently
distributed each as p-variate normal with mean vector $\mu$ and covariance matrix
$\Sigma$. We wish to give an unbiased estimator of the partial correlation coefficient
$\rho_{12\cdot(q\cdots p)}$. The usual estimator, $r_{12\cdot(q\cdots p)}$, has the density (2.1) with $n = N -
(p - q)$ if $\mu$ is known, and $n = N - 1 - (p - q)$ if $\mu$ is unknown. Therefore
$G(r_{12\cdot(q\cdots p)})$ (with appropriate $n$) is the unique minimum variance unbiased
estimator of $\rho_{12\cdot(q\cdots p)}$ and possesses the other properties of $G(r)$.


3. Intraclass correlation coefficient. Let the columns of $X$: $p \times N$ be
independently distributed, each as $N(\mu, \Sigma^*)$, i.e., p-variate normal with mean vector
$\mu$ and covariance matrix $\Sigma^*$. Suppose $\Sigma^*$ is of the form $\sigma^2[(1 - \rho)I + \rho ee']$, where
$e' = (1, \cdots, 1)$, i.e., $\sigma^*_{ii} = \sigma^2$, $\sigma^*_{ij} = \rho\sigma^2$ $(i \ne j)$, with $\rho$ and $\sigma^2$ unknown. The
problem is to estimate $\rho$ unbiasedly.

We note that $\rho$ is just the slope of the regression line of $x_2$ on $x_1$, and is
therefore estimated unbiasedly by


where a dot indicates an average over the omitted subscript. We will see
presently that there is a complete sufficient statistic $(u, v)$. $\hat{\rho}$ is not a function of
$(u, v)$, nor is it confined to the range of $\rho$, namely, $-1/(p - 1)$ to 1. However, by
the Blackwell-Rao theorem, $E(\hat{\rho} \mid u, v)$ is the unique minimum variance
unbiased estimator of $\rho$. Since $E(\hat{\rho} \mid u, v)$ is difficult to obtain, we shall use the joint
distribution of $u$ and $v$ to obtain an unbiased estimator $h(u, v)$ of $\rho$, which, by
completeness, must equal $E(\hat{\rho} \mid u, v)$.

As in the previous section, sufficiency and invariance suggest that we confine
ourselves to functions of the conventional estimator $r'$ of $\rho$. However, it is easier
to deal with the density of $(u, v)$, and it will turn out that the unbiased estimator
$h(u, v)$ is a function $H(r')$ of $r'$ alone.

3.1 Reduction to canonical form. Let $A$: $p \times p$ be an orthogonal matrix with
first row $p^{-1/2}e'$, and let $Y = AX$. Then the columns of $Y$ are independently
distributed, each as $N(A\mu, A\Sigma^*A')$. Now

$$A\Sigma^*A' = \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 I \end{pmatrix},$$

where $\sigma_1^2 = \sigma^2[1 + (p - 1)\rho]$, $\sigma_2^2 = \sigma^2(1 - \rho)$. Because of the particular diagonal
form of the covariance matrix, the $y_{i\alpha}$ $(i = 1, \cdots, p;\ \alpha = 1, \cdots, N)$ are
independent, and if we let $\eta = A\mu = Ey$, then $y_{1\alpha}$ is $N(\eta_1, \sigma_1^2)$, $(\alpha = 1, \cdots, N)$
and $y_{i\alpha}$ is $N(\eta_i, \sigma_2^2)$, $(i = 2, \cdots, p;\ \alpha = 1, \cdots, N)$. We can therefore obtain
two sums of squares, $u$ and $v$, sufficient for $\sigma_1^2$ and $\sigma_2^2$ and distributed independently
as $\sigma_1^2\chi_a^2$ and $\sigma_2^2\chi_b^2$, where the degrees of freedom $a$ and $b$ depend on our knowledge
of $\mu$. To write $u$ and $v$ conveniently, we first observe that


i=l a=1 


p N Pp N 
Z >» Yio = tr YY’ = tr XX’ = 7. > zie, 


‘=1 a=1 
y 


D ie = 


a=1 


p 


Dye 


i=l 


Yi. 
Precisely, we consider the following three cases: 
(i) w# = Oand hence 7 = Ap = O. Leta = N,b 


N 
«5 
o=LVwR=ALY 


(ii) w completely unknown and hence 7 


Au 
Leta = N —1,b6 = (p— 1)(N —- 1), 


is also completely unknown. 


N 


XN 
om (Yia = Yr.) > 2. a oe i 


L.¢ Hue} 9 
a=1l a=l 


Pp N ‘ p 
= > > YVa—-Y-) = > > Lia ~ Tj. — La T TZ. 
i=2 a=1 a=l i=1 


(iil) w we, Where w is an unknown scalar, and hence 
w VY p(l,0,--- ,0)’. Leta =N —1,b 


7 


= (p — 1)N, 


N v 
vaste 7 (Ya — yn.) = Pp 2. (Tie — 


a=) a=l 


p 


P Nv N 
yl Es RP ond 


i=2 a=l i=1 a=1 


In each case $u/\sigma_1^2$ and $v/\sigma_2^2$ are independently distributed as $\chi_a^2$ and $\chi_b^2$,
and it is easily shown that $(u, v)$ is a complete sufficient statistic for $(\sigma_1^2, \sigma_2^2)$.
The three cases can thus be treated simultaneously.
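
The canonical reduction rests on the fact that $\Sigma^* = \sigma^2[(1-\rho)I + \rho ee']$ has the eigenvalue $\sigma_1^2 = \sigma^2[1+(p-1)\rho]$ on $e$ and $\sigma_2^2 = \sigma^2(1-\rho)$ on every contrast orthogonal to $e$. The short check below is a sketch with arbitrary illustrative values of $p$, $\sigma^2$, $\rho$; the helper names are hypothetical. It also recovers $\rho$ from $(\sigma_1^2, \sigma_2^2)$, which is the parametrization used when $(u, v)$ replaces the raw data.

```python
def intraclass_cov(p, sigma_sq, rho):
    """Covariance matrix sigma^2 [(1 - rho) I + rho e e'] as a list of rows."""
    return [[sigma_sq if i == j else rho * sigma_sq for j in range(p)]
            for i in range(p)]

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]

def rho_from_variances(sigma1_sq, sigma2_sq, p):
    """Invert sigma1^2 = s^2[1+(p-1)rho], sigma2^2 = s^2(1-rho) for rho."""
    return (sigma1_sq - sigma2_sq) / (sigma1_sq + (p - 1) * sigma2_sq)

p, sigma_sq, rho = 4, 2.0, 0.3                # illustrative values only
cov = intraclass_cov(p, sigma_sq, rho)
sigma1_sq = sigma_sq * (1 + (p - 1) * rho)    # eigenvalue on e = (1, ..., 1)'
sigma2_sq = sigma_sq * (1 - rho)              # eigenvalue on contrasts

e = [1.0] * p
contrast = [1.0, -1.0] + [0.0] * (p - 2)      # orthogonal to e

print(matvec(cov, e), [sigma1_sq * x for x in e])
print(matvec(cov, contrast), [sigma2_sq * x for x in contrast])
print(rho_from_variances(sigma1_sq, sigma2_sq, p))
```

Since both eigenvalues must be nonnegative, the same two expressions give the admissible range $-1/(p-1) \le \rho \le 1$ quoted in the text.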


3.2 Derivation of the unbiased estimator. The condition that $h(u, v)$ be
unbiased is

$$(3.1) \qquad \int_0^\infty\!\!\int_0^\infty h(u, v)\,u^{a/2-1}v^{b/2-1}e^{-\theta u-\phi v}\,du\,dv = \frac{\Gamma\!\left(\frac{a}{2}\right)\Gamma\!\left(\frac{b}{2}\right)}{\theta^{a/2}\phi^{b/2}}\cdot\frac{\phi-\theta}{\phi+(p-1)\theta},$$

where $\theta = 1/(2\sigma_1^2)$, $\phi = 1/(2\sigma_2^2)$. The right-hand side is the bivariate Laplace
transform of


b 


| al a Pe 
( _ 1) | lu — (p — 1)y/}? ‘o~ y)> dy 


32) ~ : 
al : ey, Powe 
-~ 1) | waw- (p — | t}° (y — t)? dt, 


0 


4 


vj, [4, p. 36 (Satz 12), p. 208 (9), p. 236 (87)]. 


where L = min [u/(p — 1), 
by parts and letting z = u/[(p — 1)v], we 


Integrating the first term of (3.2) 


obtain 


b a 
1 


h(u,v) = h*(z) - zw)? (1 — w)? 


for ze2l, 


(2, 2.12 (1). Integrating the second term of (3.2) by parts we obtain the following 


alternative to (3.4): 


I 
h*(z) (¢ 


yo 


3.9) 


(2, 2.12 (1)]. 
The conventional estimate of p is (e.g., [5] and [6]), 
(Lia ns hi) (Lja _ h;) 
p ~ - F 


P -% 7 z (Zia — pi)” 


im (=. - 
p-l\ut+v 


where 4 is the appropriate estimate of yu in (i), (ii), or (ili). Now 


io en ST es 


(3.7 2/1 7"\? 
- wo FG ~ 


which is a strictly increasing function of $r'$. Thus $h^*(z)$ is a function $H(r')$ of $r'$.

TABLE 2
Intraclass correlation coefficient, bivariate case, n degrees of freedom
$H(r') = r'F(1/2, 1; n/2; 1 - r'^2)$
(Table values are not recoverable from the source.)

3.3. Properties of the unbiased estimator. For $\rho = 1$, $z = \infty$, $r' = 1$ with
probability 1, and $h^*(\infty) = H(1) = 1$; for $\rho = -1/(p - 1)$, $z = 0$, $r' = -1/(p - 1)$
with probability 1, and $h^*(0) = H(-1/(p - 1)) = -1/(p - 1)$. Thus in the
two cases when $\Sigma^*$ is singular, $h^*(z) = H(r') = \rho$ with probability 1.
Furthermore, $h^*(z)$ is a strictly increasing function of $z$, since the integrand of (3.3) for
$0 \le z \le 1$ and of (3.5) for $z \ge 1$ is strictly monotone for each value
of $w$, $0 < w < 1$. Consequently, $H(r')$ is a strictly increasing function of $r'$ and
$-1/(p - 1) \le h^*(z) = H(r') \le 1$, which is the range of $\rho$.

As remarked before, $h^*(z) = H(r')$ is the unique minimum variance unbiased
estimator of $\rho$.

We will now obtain the asymptotic distribution of h*(z). Note that z is dis- 
tributed as 


1+@-We ap, 
i—~* b(p — 1) 


and that ~/(p — 1)N/p (F.» —1) is asymptotically N(0, 1). Therefore, letting 
zo = [1 + (p — 1)pl/[(p — 1)* (1 — p—)], the quantity, 


/(@ = N E (@-W'd~)_,]_ ,/N@-D"U- 2), 
4 Pp ~ 1+ (p— lp V p 1l1+(p—ljp ~ 
is asymptotically N(0, 1). But, by (3.3), denoting N~** by ¢, we have, for z < 


N f* \N/2 
P | fl—-w-—(p- 1)zw}*”* dw 


1 — h*(z) = — = 
, p—12% 


- {1 + O(2))*"1 + Of€2] + NOU — 


ili ieatnis aici (+) 
p—11+(p—1)2- N/’ 
uniformly in z. We obtain the same result for z = 1 from (3.4). Therefore 
h*(z) = p+ (p — 1) (1 — p)’ (2 — 2)/p + O,(1/N). 
Therefore 
V/N [h*(z) — p] = WN [H(r’) — p] is asymptotically N(0, 0°), 


where o° = (1 — p)’ [1 + (p — 1)p|'/[p(p — 1)]. 
Expanding r’ about 2 in (3.6) we find 


r’ = h*(z) + O,(1/N) = H(r’) + O,(1/N), 


so that r’ is asymptotically equivalent to H(r’). Incidentally, we find that 
/N(r’ — p) is asymptotically N(0, 0”), with the same o’. 

In order to facilitate the use of the unbiased estimator in the bivariate case
with $n$ degrees of freedom, i.e., case (i) or (ii) with $p = 2$, Table 2 gives $H(r')$
and (for easier interpolation) $H(r')/r'$ for $r' = 0(.1)1$ and $a = b = n = 2(2)30$.
In this case, $H(r') = H_n(r')$ is an odd function of $r'$. The computation was carried
out by means of the recursive relation

$$H_n(r') = \frac{n-2}{n-3}\left[\frac{r'}{1-r'^2} - \frac{r'^2}{1-r'^2}\,H_{n-2}(r')\right],$$


together with the initial conditions 





H.(r’) = :: 
$H_n(r')$ was derived for $n \ge 3$. For $n = 2$, the inversion in (3.1) must be carried
out separately, and the result agrees with the final form of $h^*(z)$ in (3.4) and (3.6).
The recursive relation is obtained by application of the relations [2, 2.8 (36) and
(39)]. The same formulas give recursive relations for any values of $p$, $a$, $b$.

For the bivariate case with $n$ degrees of freedom,

$$(3.8) \qquad H(r') = r'F(\tfrac12, 1; \tfrac{n}{2}; 1 - r'^2),$$

which is obtained from [2, 2.8 (36) and 2.11 (34)].
Approximations for $H(r')$ can be obtained from the expansion (2.2) applied
to (3.8), which gives

$$(3.9) \qquad \frac{H(r')}{r'} = 1 + \frac{1-r'^2}{n} + \frac{3(1-r'^2)^2}{n(n+2)} + O(n^{-3}).$$

This gives $H(r')/r'$ within .01 for $n \ge 19$ or .001 for $n \ge 57$ if two terms are
included, and within .01 for $n \ge 12$ or .001 for $n \ge 26$ if three terms are included.
As in (2.6), the neglected terms in (3.9) are all positive and decreasing in $r'^2$ and $n$.
The $k$ that minimizes the maximum over $r'$ of the absolute difference between
$H(r')/r'$ and $1 + (1 - r'^2)/(n - k)$ is, for large $n$, $6(-1 + \sqrt{2}) = 2.48$. This
suggests the approximation

$$(3.10) \qquad \frac{H(r')}{r'} \approx 1 + \frac{1-r'^2}{n - 5/2}.$$

This is accurate within .01 for $n \ge 10$ or .001 for $n \ge 26$.
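
A small numerical sketch of (3.8), the recursion for $H_n$, and the approximation (3.10) in the bivariate case follows. The names `hyp2f1_series`, `intraclass_h`, `h_step` and `h_approx` are hypothetical helpers chosen for illustration; the series is summed directly and is only intended for $0 < r' \le 1$.

```python
def hyp2f1_series(a, b, c, x, tol=1e-15, max_terms=100000):
    """Sum the Gauss series (2.2) for F(a, b; c; x), assuming 0 <= x < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # ratio of consecutive terms of the hypergeometric series
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def intraclass_h(r, n):
    """(3.8): H(r') = r' F(1/2, 1; n/2; 1 - r'^2), bivariate case."""
    return r * hyp2f1_series(0.5, 1.0, n / 2.0, 1.0 - r * r)

def h_step(r, n, h_prev):
    """One step of the recursion: H_n from H_{n-2} (requires n != 3)."""
    return (n - 2.0) / (n - 3.0) * (r - r * r * h_prev) / (1.0 - r * r)

def h_approx(r, n):
    """Approximation (3.10): H(r') ~ r' [1 + (1 - r'^2) / (n - 5/2)]."""
    return r * (1.0 + (1.0 - r * r) / (n - 2.5))

if __name__ == "__main__":
    r = 0.5
    print(intraclass_h(r, 4), 2 * r / (1 + r))          # closed form for n = 4
    print(h_step(r, 6, intraclass_h(r, 4)), intraclass_h(r, 6))
    print(intraclass_h(r, 30) / r, h_approx(r, 30) / r)
```

The closed form $2r'/(1+r')$ for $n = 4$ and $r' > 0$ follows from $F(\tfrac12, 1; 2; x) = 2/(1+\sqrt{1-x})$ and is used here only as a check on the series.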


4. Multiple correlation coefficient. Suppose we have $N$ independent
observations on a $(p + 1)$-variate normal distribution with mean vector $\mu$ and covariance
matrix $\Sigma$, and we wish to give an unbiased estimator of the squared multiple
correlation

$$\rho^2 = \rho^2_{0\cdot(12\cdots p)} = 1 - \mathcal{R}/\mathcal{R}_{00},$$

where $\mathcal{R}$ is the determinant of the correlation matrix and $\mathcal{R}_{00}$ is its first cofactor.
We are concerned with the cases (i) $\mu$ known, $\Sigma$ unknown, and (ii) all parameters
unknown.

As in 2.1, we confine ourselves to functions of

$$r^2 = r^2_{0\cdot(12\cdots p)} = 1 - R/R_{00},$$

where $R$ is the determinant of the appropriate (to (i) or (ii)) sample
correlation matrix and $R_{00}$ is its first cofactor.

The condition that $I(r^2)$ be unbiased is


ofn 
n I"{- k 2k 
(3 * ) Olas p—2) /2)+k 2) (n—p—1) /2 72 
ba 9 I(r°)(r°) (1 —r’) dr 
! Jo 


k=0 r ( 4 2) 
= ( 5 2) r (*) (1 os Ps 2 
where $n = N$ and $(N - 1)$ in cases (i) and (ii). Following the methods of Sec. 2,
we obtain

$$I(r^2) = 1 - \frac{n-2}{n-p}\,(1 - r^2)\,F\!\left(1, 1; \frac{n-p+2}{2}; 1 - r^2\right).$$

As usual, $I(r^2)$ is strictly increasing in $r^2$, and differs from it only by terms of
order $1/N$, and it is the unique minimum variance unbiased estimator of $\rho^2$.
Also $I(1) = 1$. However, $I(0) = -p/(n - p - 2)$. We cannot hope for a
nonnegative unbiased estimator, since there is no region in the sample space having
zero probability for $\rho^2 = 0$ and positive probability for $\rho^2 > 0$. For the same
reason there can be no positive unbiased estimator of $\rho$ either.
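
The behaviour of $I(r^2)$ at the ends of its range can be checked numerically. The sketch below assumes the reading $I(r^2) = 1 - \frac{n-2}{n-p}(1-r^2)F(1, 1; \frac{n-p+2}{2}; 1-r^2)$ and sums the series (2.2) directly; the names `hyp2f1_series` and `unbiased_r2` are illustrative, and the series at $r^2 = 0$ converges only when $n - p > 2$ (slowly if $n - p$ is small).

```python
def hyp2f1_series(a, b, c, x, tol=1e-15, max_terms=100000):
    """Sum the Gauss series (2.2) for F(a, b; c; x), for 0 <= x <= 1
    (at x = 1 this needs c - a - b > 0)."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # ratio of consecutive terms of the hypergeometric series
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def unbiased_r2(r2, n, p):
    """I(r^2) for sample squared multiple correlation r2, p regressors,
    n degrees of freedom (n = N or N - 1)."""
    x = 1.0 - r2
    return 1.0 - (n - 2.0) / (n - p) * x * hyp2f1_series(
        1.0, 1.0, (n - p + 2.0) / 2.0, x)

if __name__ == "__main__":
    n, p = 12, 2                                # illustrative values only
    for r2 in (0.0, 0.05, 0.3, 0.6, 1.0):
        print(r2, unbiased_r2(r2, n, p))
    print(-p / (n - p - 2.0))                   # the stated value of I(0)
```

The negative values printed for small `r2` illustrate why the estimator can fall outside the range of $\rho^2$.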
TRANSIENT ATOMIC MARKOV CHAINS WITH A DENUMERABLE
NUMBER OF STATES¹


By LEO BREIMAN
University of California 


1. Introduction. Many of the more interesting transient Markov chains have
the property that for any set of states $A$ and any initial distribution, the
probability of entering $A$ infinitely often (i.o.) is either zero always or one always.
This type of chain has been termed atomic by D. Blackwell [1] and is exemplified
by the three-dimensional random walk or by the successive sums of independent,
identically distributed random variables.

In this paper we investigate the "fine structure" of an atomic chain, that is,
we try to characterize the class of all sets $A$ such that $P(x_n \in A \text{ i.o.}) = 0$. The
study is restricted to atomic chains with a countable set of states which, for
convenience of notation, we identify with the integers, and with stationary
transition probabilities $p_{ij}$.

The martingale convergence theorem is used in [1] to show that a necessary
and sufficient condition for atomicity is that every bounded solution $\phi$ of

$$\phi(i) = \sum_j p_{ij}\phi(j)$$

be constant. We use as our main tool the semi-martingale convergence theorem
and the corresponding equation $\phi(i) \ge \sum_j p_{ij}\phi(j)$ and obtain a complete, but
not simple, characterization of the fine structure of transient atomic chains.

To illustrate the use of the above characterization we prove two theorems
regarding the return to equilibrium times $x_0, x_1, \cdots$ in the coin-tossing game.
The latter of these is then used to prove that there exists no set of numbers
$\{\lambda_m\}$ such that² $P(x_n \in A \text{ i.o.}) = 0 \Leftrightarrow \sum_{m \in A} \lambda_m < \infty$.

This last result shows that, in general, there is no simple resolution to the
question of defining the fine structure. There are, however, a number of interesting
transient atomic chains which have the property that every infinite set of states
is entered infinitely often with probability one. These chains are the subject of
papers by Chung and Derman [2], and Breiman [3].


2. Use of the semi-martingale theorem.


THEOREM 1. Let $x_0, x_1, \cdots$ be an atomic chain. Then for $\phi$ any nonnegative
solution of

$$\text{(a)} \qquad \phi(i) \ge \sum_j p_{ij}\phi(j)$$

which is finite for at least one value of $i$, $\phi(x_n)$ converges almost surely (a.s.) to a
constant independent of the initial distribution.

¹ This paper was prepared with the support of the Office of Ordnance Research, U.S.
Army, under Contract DA-04-200-ORD-171.
² The referee has informed us that a similar theorem for the three-dimensional random
walk has been proved by P. Erdős and B. J. Murdoch (unpublished).

PROOF. Let $\phi$ be a nonnegative solution of (a) with $\phi(0) < \infty$, and let $R$ be
the set of all states $j$ such that $P(\text{entering } j \mid x_0 = 0) > 0$. From the atomicity
and $P(x_n \in \bar{R} \text{ i.o.} \mid x_0 = 0) = 0$, where $\bar{R}$ is the complement of $R$, follows
$P(x_n \in \bar{R} \text{ i.o.}) = 0$. Therefore, it is sufficient to prove the theorem for initial
distributions concentrated on $R$, noting that $\phi$ is finite on $R$. Pick any such
distribution $\{p_j\}$ such that $\sum_j p_j\phi(j) < \infty$ and $p_j > 0$, all $j \in R$. The random
variables $\phi(x_n)$ form a semi-martingale with respect to the fields generated by
$x_0, x_1, \cdots$, since

$$E(\phi(x_n) \mid x_{n-1}, \cdots, x_0) = E(\phi(x_n) \mid x_{n-1}) = \sum_j p_{x_{n-1}j}\,\phi(j) \le \phi(x_{n-1}),$$

$$E|\phi(x_n)| = E\phi(x_n) \le E\phi(x_0) < \infty.$$

By the semi-martingale theorem [4], $\phi(x_n)$ converges a.s. Suppose this limit is
nonconstant; then there will be a number $a > 0$ such that if $A$ is the set of states
defined by $\{j;\ \phi(j) \ge a\}$, then $0 < P(x_n \in A \text{ i.o.}) < 1$, which contradicts
atomicity. Hence the limit is constant, and since $\phi(x_n)$ must converge to this same
constant with the initial distribution concentrated at any single state in $R$, the
theorem is valid.

We note that the same result is true for $\phi$ any bounded solution of (a) because
for any sufficiently large constant $\alpha$, $\phi + \alpha$ is a positive solution.

A simple but informative corollary of the above theorem demonstrates the
special applicability of (a) to the transient case.

COROLLARY. All the states of an atomic chain are recurrent if and only if all
bounded solutions of (a) are constant. For a transient atomic chain there is at least
one nonconstant bounded solution of (a).

PROOF. Let $\phi(i) = P(\text{entering } i_0 \mid x_0 = i)$, so that $\phi(i_0) = 1$, and

$$\phi(i) \ge E(P(\text{entering } i_0 \mid x_1, x_0) \mid x_0 = i) = E(\phi(x_1) \mid x_0 = i) = \sum_j p_{ij}\phi(j).$$

If every solution of (a) is constant, then for every $i$ we have $P(\text{entering } i_0 \mid x_0 =
i) = 1$. This implies that return to every state is certain. Now assume that every
state is recurrent and let $\phi$ be any bounded solution of (a). If $\phi(i) \ne \phi(j)$,
$\phi(x_n)$ cannot possibly converge to a constant since both $i$ and $j$ are entered i.o.
with probability one. If there are transient states present the function $\phi$ defined
above cannot be constant for all $i_0$.


3. Characterization of the fine structure. We use the notation

$$u_{ik} = E(\text{number of visits to } k \mid x_0 = i),$$

$$u_{ik}^{(n)} = \delta_{ik} + p_{ik}^{(1)} + \cdots + p_{ik}^{(n-1)},$$

and recall that $u_{ik} = \lim_n u_{ik}^{(n)}$.
THEOREM 2. If $x_0, x_1, \cdots$ is an atomic chain, then for every nonnegative
sequence of numbers $\{a_k\}$ with $\sum_k u_{0k}a_k < \infty$ and every $\epsilon > 0$, the set of states
$A_a = \{i;\ \sum_k u_{ik}a_k \ge \epsilon\}$ has the property $P(x_n \in A_a \text{ i.o.}) = 0$. Conversely, every
set of states $A$ such that $P(x_n \in A \text{ i.o.}) = 0$ is included in at least one of the sets
$A_a$ as defined above.

PROOF. Let $\{a_k\}$ be a sequence fulfilling the conditions of the theorem and let
$\phi(i) = \sum_k u_{ik}a_k$. The identity $\sum_j p_{ij}u_{jk} = u_{ik} - \delta_{ik}$ leads to the equation
$\sum_j p_{ij}\phi(j) = \phi(i) - a_i$. Thus, theorem 1 applies and $\phi(x_n)$ converges a.s. to a
constant. Since the properties in which we are interested do not, in an atomic
chain, depend on initial conditions, it is sufficient to take $x_0 = 0$. Then,
iterating the equation which $\phi$ satisfies,

$$E(\phi(x_n) \mid x_0 = 0) = \sum_k \left(u_{0k} - u_{0k}^{(n)}\right)a_k \to 0$$

and by a semi-martingale inequality ([4], p. 325) which states that

$$E(\text{a.s. limit}) \le E\phi(x_n)$$

we are able to conclude that the a.s. limit of $\phi(x_n)$ is identically zero. This implies
that $P(\phi(x_n) \ge \epsilon \text{ i.o.}) = 0$ and proves one part of the theorem.

To get the second part, let $A$ be any set of states with $P(x_n \in A \text{ i.o.}) = 0$.
Form the function $\phi(i) = P(\text{entering } A \mid x_0 = i)$, so that $\phi(i) = 1$ for $i \in A$.
It is easy to verify that $\phi$ satisfies (a), and thus $\phi(x_n)$ converges a.s. to some
constant. We deduce that this constant is zero by noting that $P(\text{entering } A \text{ after }
n - 1 \text{ steps}) = E\phi(x_n)$. Since $P(x_n \in A \text{ i.o.}) = 0$ we conclude that $E\phi(x_n) \to 0$
and apply the bounded convergence theorem to get the result. Let the nonnegative
sequence $\{a_i\}$ be defined by $\phi(i) = a_i + \sum_j p_{ij}\phi(j)$. Iterating this equation

$$\phi(i) = \sum_j p_{ij}^{(n)}\phi(j) + \sum_j u_{ij}^{(n)}a_j.$$

By the boundedness of $\phi$ the second sum converges to $\sum_j u_{ij}a_j$. The first sum
must also converge to some bounded limit sequence $\{\lambda(i)\}$. Since

$$\lambda(i) = \sum_j p_{ij}\lambda(j),$$

by Blackwell's theorem as quoted above this sequence is constant, and by the
convergence of $\phi(x_n)$ to zero, $\lambda(i) = 0$. The set $A$ is contained in the set $A_a =
\{i;\ \sum_k u_{ik}a_k \ge 1\}$ which proves the theorem.


4. Two theorems concerning the coin-tossing game. We apply theorem 2 to
the Markov chain $x_0, x_1, \cdots$ whose values are the successive times of return
to equilibrium in the fair coin-tossing game. The set of states is the set of all
nonnegative even integers and we use the fact that this chain, being the sum of
independent and identically distributed random variables, is atomic. It is well
known that

$$u_{ik} = \binom{k-i}{(k-i)/2}2^{-(k-i)} \sim \frac{c}{\sqrt{k-i}}, \qquad k - i \to \infty.$$

As it is evident that the characterization given in theorem 2 is invariant under
asymptotic equivalence, we use $1/\sqrt{k-i}$ throughout this section in place of
$u_{ik}$ with the convention $1/\sqrt{0} = 1$ and $1/\sqrt{k-i} = 0$ for $k < i$.
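
The asymptotic relation for $u_{ik}$ is the familiar one for the probability that a fair coin-tossing game is at equilibrium after $d = k - i$ tosses, and it can be checked numerically. In the sketch below the constant is taken to be $c = \sqrt{2/\pi}$ (the Stirling value, stated here as an assumption since the text leaves $c$ unspecified), and the function names are illustrative.

```python
import math

def equilibrium_probability(d):
    """P(equal numbers of heads and tails after d tosses), d even."""
    return math.comb(d, d // 2) / 2 ** d

def asymptotic_value(d):
    """c / sqrt(d) with c = sqrt(2 / pi), the Stirling approximation."""
    return math.sqrt(2.0 / (math.pi * d))

if __name__ == "__main__":
    for d in (10, 100, 1000, 10000):
        exact = equilibrium_probability(d)
        approx = asymptotic_value(d)
        # the ratio tends to 1 as d grows
        print(d, exact, approx, exact / approx)
```

Because only the order $1/\sqrt{k-i}$ matters for the characterization in theorem 2, the particular value of $c$ plays no role in what follows.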
The first theorem we prove is similar to a theorem stated by Chung and Erdős.


THEOREM 3. Let the sequence of even positive integers $\{m_i\}$ be such that the
sequence $\{\Delta_i\}$ defined by $\Delta_i = m_i - m_{i-1}$ is nondecreasing. Then

$$P(x_n \in \{m_i\} \text{ i.o.}) = 0 \Leftrightarrow \sum_i \frac{1}{\sqrt{m_i}} < \infty.$$

PROOF. If $\sum 1/\sqrt{m_i} < \infty$, the assertion follows immediately from the
Borel-Cantelli lemma. Now assume that $P(x_n \in \{m_i\} \text{ i.o.}) = 0$, but that $\sum 1/\sqrt{m_i} =
\infty$. By theorem 2, there is a nonnegative sequence $\{a_k\}$ such that $\sum_k a_k/\sqrt{k} <
\infty$ and $\{m_i\} \subset \{i;\ \sum_k a_k/\sqrt{k-i} \ge \epsilon\}$. From this we have for all $m_i$,

$$\sum_{k \ge m_i} \frac{a_k}{\sqrt{k - m_i}} \ge \epsilon.$$
Define \,*’ by 
N l 
(= 0 m: Vk — m 
_ ee 
= Um; 
It is evident that Ir hi a = 
We will show that \,° 


e, all N, and that limy ;°’ = 0 for k fixed. 

< c/V/k, all k, N, and conclude from the bounded con- 
vergence theorem the contradiction that limy a _ 
assume that k => my , then 


a 


0. To begin with, 
N 


i ] 
V My Zz < 
pm. (mw) 
Ve r “ 


1=0 
k nmi 


IA 








Vm; Vm 


oo my, 
a a 


1 


i=0 /m; 
By splitting the top sum into the two parts m; S my/2, m; > my/2 and using 


/ 
our assumption concerning A;, we get 1 bY? < 4. Now if k < myx, let Ma 
be the largest of the m; which is <k. With this 


N 


Vato 


Vim? s = 


Vm;Vm, — Mm; 
< Xj 


i=0 \V Mm; 
. ° ° (\N 
and repeating the above argument results again in ~/k\,"? < 4. 
It is clear that in the above context, a little more attention to the appropriate 
inequalities would result in a considerable weakening of the growth condition on 
the sequence {m,}. 


We can get a result in another direction by combining our characterization
with different inequalities. Let all the states between and including $n_1$ and $n_2$,
$n_2 \ge n_1$, be called an interval and denoted by $[n_1, n_2]$.
THEOREM 4. If the sequence of disjoint finite intervals $\{I_j\}$, $I_j = [m_j, M_j]$
is such that for some $\delta > 0$, $m_{j+1} \ge (1 + \delta)M_j$, then, denoting

$$l_j = M_j - m_j + 2,$$

$$P(x_n \in \textstyle\bigcup_j I_j \text{ i.o.}) = 0 \Leftrightarrow \sum_j \sqrt{l_j/M_j} < \infty.$$

PROOF. Define a sequence of intervals $I_j' = [m_j', M_j']$ by $m_j' = M_j$, $M_j' =
M_j + \sqrt{l_j}$, where $\sqrt{l_j}$ is here to be interpreted as the greatest even integer
less than $\sqrt{l_j}$. Let $\sum_j \sqrt{l_j/M_j} < \infty$ and define $a_k$ by

$$a_k = \begin{cases} 1 & \text{if } k \in \bigcup_j I_j', \\ 0 & \text{otherwise.} \end{cases}$$
By these definitions 
ae 1 
“. 
Vk = 3 


DX Vii/M; < @. 
? 
Thus the set A = {i; )o,ex/+/k-i = 3} has the property that P(z, ¢ A i. 0.) 
= (). The set A includes, in particular, the integers 7 such thati < /; and 


1 —— pm 
< D> 5 UVM 1 -— Vm — 7). 


i= 
| 
| 
| 


This inequality can be easily shown to be satisfied by all i = m;, which 
proves the theorem going one way. 

To go the other way, assume that P(x, ¢ U; I; i. 0.) = 0. Then there is a non- 
negative sequence a; such that Dover/VWk <0 andU,1;C fi; Dope /Vk-i = €, 
from which, if ie 7;, then >. >m,; a /Wk-i 2 ¢. We wish to conclude that part 
of this sum is eT and argue that if 7 e J; ,and if 7 is sufficiently large 


_ ae « s/ it. _ =< «t 
a WV E-t VE? V i — Mice, VE Ww? 


so that if7eJ;, 





We sum this last inequality over 7 e J; to get 


Saas (= Je :) = 


mii >k>m; tel; 





It can be easily shown that 





and using this we conclude that 


UE Te = ig 2 VisMi. 
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5. The nonsimplicity of the fine structure of the coin-tossing game. The pur- 
pose of this section is to prove the following theorem. 

THEOREM 5. Let $x_0, x_1, \cdots$ be the successive times of return to equilibrium in 
the fair coin-tossing game. Then there exists no weighting $\{\lambda_m\}$, $\lambda_m \ge 0$, of the 
positive even integers such that 

$$P(x_n \in A \text{ i.o.}) = 0 \iff \sum_{m \in A} \lambda_m < \infty.$$
PROOF. Consider any set $\bigcup_j I_j$ where the $I_j$ are disjoint finite intervals which 
we can represent as $[m_j, m_j(1 + a_j)]$, $0 < a_j \le 1$, where $m_{j+1} \ge 3m_j$. By 
Theorem 4, 

$$P(x_n \in \textstyle\bigcup_j I_j \text{ i.o.}) = 0 \iff \sum_j \sqrt{a_j} < \infty.$$


Let now $\{\lambda_m\}$ be any weighting of the positive even integers having the 
property stated in the theorem. By this property, $\lim_m \lambda_m = 0$ since otherwise 
we could find an indefinitely sparse set $A$ which would be entered i.o. with 
probability one. We define a function $\varphi(\alpha)$, $0 \le \alpha < \infty$, by 

$$\varphi(\alpha) = \liminf_{n} \sum_{m=n}^{ne^{\alpha}} \lambda_m,$$

where in writing the upper limit of summation as $ne^{\alpha}$ it is immaterial whether we 
take the next greater integer, or the previous integer. 

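PROPOSITION. $\varphi(\alpha)$ is monotone nondecreasing, $\varphi(\alpha + \beta) \ge \varphi(\alpha) + \varphi(\beta)$, 
and there is a neighborhood of the origin in which $\varphi(\alpha) < \infty$. 

PROOF. The first assertion is immediate. As to the second, we write 

$$\liminf_n \Big(\sum_{m=n}^{ne^{\alpha+\beta}} \lambda_m\Big) = \liminf_n \Big(\sum_{m=n}^{ne^{\alpha}} \lambda_m + \sum_{m=ne^{\alpha}}^{ne^{\alpha+\beta}} \lambda_m\Big) \ge \liminf_n \Big(\sum_{m=n}^{ne^{\alpha}} \lambda_m\Big) + \liminf_n \Big(\sum_{m=n}^{ne^{\beta}} \lambda_m\Big).$$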


Finally, suppose that $\varphi(\alpha) = \infty$ for all $\alpha > 0$, and consider any sequence 
$\{a_j\}$, $0 < a_j \le 1$, such that $\sum_j \sqrt{a_j} < \infty$. Since $\lim_n \sum_{m=n}^{n(1+a_j)} \lambda_m = \infty$ for 
all $j$, we find a sequence of intervals $I_j = [m_j, m_j(1 + a_j)]$ as far apart as 
desired having the undesirable property $\sum_j \sum_{m \in I_j} \lambda_m = \infty$. 

To complete the proof of the theorem, we note that as a well-known 
consequence of the proposition there is a neighborhood $N$ of the origin and a constant 
$q < \infty$ such that $\varphi(\alpha) \le q\alpha$, $\alpha \in N$. Take $\{a_j\}$, $a_j > 0$, such that $\sum_j a_j < \infty$ 
but $\sum_j \sqrt{a_j} = \infty$, and $\{a_j\} \subset N$. Then we may find a sequence $\{m_j\}$ 
increasing as rapidly as desired such that 

$$\sum_{m=m_j}^{m_j(1+a_j)} \lambda_m \le 2qa_j.$$

Hence, taking $I_j = [m_j, m_j(1 + a_j)]$ we have 

$$\sum_j \sqrt{a_j} = \infty \quad \text{but} \quad \sum_j \sum_{m \in I_j} \lambda_m < \infty.$$
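
The last step needs a sequence with $\sum_j a_j < \infty$ but $\sum_j \sqrt{a_j} = \infty$. As an illustration only, the sketch below tabulates partial sums for the choice $a_j = j^{-2}$; this particular sequence and the function name `partial_sums` are assumptions made for the example and do not appear in the text. The first sum stays below $\pi^2/6$ while the second is the harmonic series.

```python
import math


def partial_sums(n_terms):
    """Return (sum of a_j, sum of sqrt(a_j)) for a_j = 1/j**2, j = 1..n_terms."""
    sum_a = 0.0
    sum_sqrt_a = 0.0
    for j in range(1, n_terms + 1):
        a_j = 1.0 / (j * j)
        sum_a += a_j
        sum_sqrt_a += math.sqrt(a_j)
    return sum_a, sum_sqrt_a


if __name__ == "__main__":
    # The first column stays bounded; the second grows like log(n).
    for n in (10, 1000, 100000):
        sum_a, sum_sqrt_a = partial_sums(n)
        print(n, round(sum_a, 4), round(sum_sqrt_a, 4))
```

Any tail of this sequence lies in a prescribed neighborhood $N$ of the origin, which is all the argument requires.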
It is a pleasure to acknowledge my debt to David Blackwell who brought 
my attention to the problems treated above. 
A CONSTRUCTION FOR ROOM'S SQUARES AND AN APPLICATION IN 
EXPERIMENTAL DESIGN 


By J. W. ARCHBOLD AND N. L. JOHNSON 
University College, London 


1. T. G. Room [1] recently proposed the following problem: To arrange the 
n(2n — 1) symbols rs (which is the same as sr) formed from all pairs of 2n differ- 
ent digits in a square of 2n — 1 rows and columns such that in each row and 
column there appear n symbols (and n — 1 blanks) which among them contain 
all 2n digits. 

He remarked that the problem is soluble when n = 1 (trivially) and n = 4 but 
not when n = 2 or 3; and he gave one solution for n = 4. 

Squares of such a type have uses in experimental designs. We explain below a 
simple construction for squares where $n$ has the form $2^{2m}$. Each square 
constructed in this way is represented in a canonical form by applying a well-known 
theorem of J. Singer [2]. In this form as soon as the top row of entries in a square 
is known, all the other entries may be written down immediately by means of a 
straight-forward cyclic process. Thus an index of first rows is all that is necessary 
to catalogue squares in their canonical forms. 

It may be permissible to give here a slight modification of the proof of Singer’s 
theorem in order to show a natural application of the regular representation of 
linear algebras. 


2. Let $\mathfrak{A}$ be a linear associative algebra, of order $m$ and with modulus, over a 
commutative field $K$. It is well known that $\mathfrak{A}$ is isomorphic with an algebra of 
$m \times m$ matrices whose elements belong to $K$ (cf. MacDuffee [3], Section 123). 

A Galois field $GF(p^{mn})$ is such a linear algebra over a $GF(p^n)$. If the elements 
of the $GF(p^{mn})$ are $0, \alpha, \alpha^2, \cdots, \alpha^{p^{mn}-1} = 1$, the irreducible equation, of degree $m$ 
and with coefficients in $GF(p^n)$, 

$$f(x) \equiv x^m - a_1x^{m-1} - \cdots - a_m = 0$$

which is satisfied by $\alpha$ is called primitive (Dickson [4], Section 35). A basis for 
the algebra consists of $1, \alpha, \alpha^2, \cdots, \alpha^{m-1}$ and the modulus is 1. 


The primitive equation is both the minimum and characteristic equation of 
the companion matrix 


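$$A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_m & a_{m-1} & a_{m-2} & \cdots & a_1 \end{pmatrix}.$$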


Received December 20, 1956; revised October 16, 1957. 
The correspondence $\alpha^j \leftrightarrow A^j$ determines an isomorphy, or regular 
representation, of the $GF(p^{mn})$ on the algebra or Galois field whose elements are the $m \times m$ 
matrices $0, A, A^2, \cdots, A^{p^{mn}-1} = I$, where $I$ is the unit matrix (cf. MacDuffee 
[3], Section 109). If $N = 1 + p^n + \cdots + p^{(m-1)n}$ then the matrices $A^{jN}$, for 
$j = 1, \cdots, p^n - 1$, are the multiples of $I$ by the elements of $GF(p^n)$ and form 
a sub-algebra, of matrices, isomorphic with $GF(p^n)$. 

In a finite projective space $PG(m - 1, p^n)$ over the $GF(p^n)$, let $x$ and $y$ denote 
column coordinate vectors. Then the equation $ky = Ax$, where $k$ is any non-zero 
element of $GF(p^n)$, determines a homography in the space of period $N$. This is 
Singer's theorem; and the proof differs from his more in form than substance. 
It is significant for us that $N$ is also the number of points in the space. 


3. Confine attention now to the case where $p = 2$ and $n = 1$. The space, a 
$PG(m - 1, 2)$, contains $\mu = 2^m - 1$ points, with three on every line. 
The following are primitive irreducible polynomials over $GF(2)$: 


(x + 1), x —(r+1), x —(r#+1), 
6 


(x* + 1), z® — (x + 1), z’ — (x + 1) 
(e?+e°4+2°+4 1), x? — (2° + ote’ + 27° + 1). 


This list is taken from Dickson ([4], p. 44); it is not exhaustive for the degrees 
mentioned but for each degree the second largest exponent of $x$ is as small as 
possible. 

For a given $m$, choose any appropriate primitive polynomial and consider the 
associated homography of $PG(m - 1, 2)$ of period $\mu$. If $P_1$ is any point of the 
space, let its successive transforms under the homography be $P_2, P_3, \cdots, 
P_\mu$ ($P_{\mu+1} = P_1$). 

Now consider the space $PG(m - 1, 2)$ as being a prime in a $PG(m, 2)$. To 
achieve this, suppose $x_1, \cdots, x_\mu$ are coordinate vectors for $P_1, \cdots, P_\mu$. Then 
coordinate vectors for all but one of the points in $PG(m, 2)$ are obtained by 
adding a further zero or unit coordinate at the end of each $x_i$; and the last point 
by taking coordinates consisting of $m$ zeros followed by 1. Denote this last point 
by $Q_0$ and let $Q_i$ be the third point on the line $Q_0P_i$; $Q_i$ and $P_i$ have the same first 
$m$ coordinates. 

To fix ideas, take $m = 3$ and $f(x) = x^3 - x - 1$. Then $\mu = 7$ and the 
corresponding homography is 


Yor Yr Yo = % 2X2: 
Starting with $x_0 = 1$, $x_1 = x_2 = 0$, we obtain for $PG(3, 2)$ the following points: 
P, P, P3 P, Ps P, P; 
/1 0 1) 0 (1 (1 
1 0 1 
0 1 0 
0 0/ 
Q QQ Q Q Q 


0 1 0 ) 1 
0 0 0 ] 0 
0 0 1 0 1 


] 1 \l \1 \l 


4. The idea is now to rename the points $Q_1, \cdots, Q_\mu$ as $R_1, \cdots, R_\mu$ in some 
order to be determined with the object, when possible, of ensuring that whenever 
the line $Q_iQ_j$ passes through a point $P_r$ then the line $R_iR_j$ passes through a 
different point $P_s$. 

The various incidences are then registered in a table of $\mu$ rows and $\mu$ columns 
as follows: if the line $Q_iQ_j$ passes through $P_r$ and $R_iR_j$ passes through $P_s$ make 
the entry 


ij or 4,0 
in the place belonging to the rth row and sth column of the table. 

The number of entries in each row and column is the number of lines through 
a point of $PG(m, 2)$ which do not lie in a prime through the point. This number 
is $(2^m - 1) - (2^{m-1} - 1) = 2^{m-1}$. And the entries in every row and column are 
all the integers $0, 1, 2, 3, \cdots, 2^m - 1$ taken in pairs. No two pairs are the same 
and there are $2^{m-1}(2^m - 1)$ entries altogether. 

In the cases examined below, the desired objective is reached when $m$ is odd 
by defining $R_t$ to be $Q_u$, where $u = 2^m - t$; and then no position in the incidence 
table contains more than one entry of the form $(i, j)$. When $m$ is even, the same 
definition is used for $R_t$ but this leads to two entries in each position in the 
south-west to north-east diagonal of the table: no better definition for $R_t$ has been 
devised which will prevent two entries from occurring in the same position. 


TABLE 1 


5. For the case $m = 3$, which we began to consider in Section 3, let us define 
$R_i$ to be $Q_{8-i}$ ($i = 1, \cdots, 7$). We then obtain the incidences shown in Table 1. 

It will be noticed that, beginning with the second, each row or column is 
obtained by a cyclic change in the positions, and values modulo 7, of the entries 
in the preceding row or column: that is, if $X_{r,s}$, $Y_{r,s}$ are the entries in row $r$ and 
column $s$ and $X_{r,s} \ne 0$, then, modulo 7, 

$$X_{r,s} = 1 + X_{r-1,s+1}, \qquad Y_{r,s} = 1 + Y_{r-1,s+1}.$$


The whole table is therefore completely determined by the entries in any one 
row or column. 
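
The construction of Sections 3 to 5 can be checked mechanically. The following is a minimal sketch, not the authors' own computation: it assumes the points $P_1, \cdots, P_\mu$ are generated by repeated multiplication by $\alpha$ in $GF(2^m)$ (the regular representation of Section 2 in the basis $1, \alpha, \cdots, \alpha^{m-1}$), that $R_0 = Q_0$, and that the entry $(i, 0)$ goes in row $i$, column $\mu + 1 - i$. The names `incidence_table` and `is_room_square` are hypothetical. Over $GF(2)$ the third point on the line $Q_iQ_j$ has coordinates $P_i + P_j$, which is a bitwise XOR when points are packed into integers.

```python
from itertools import combinations


def incidence_table(m, poly):
    """Incidence table of Section 4 for PG(m, 2).

    poly is the bitmask of a primitive polynomial of degree m over GF(2),
    e.g. 0b1011 for x^3 + x + 1 (the same as x^3 - x - 1 over GF(2)).
    Returns (mu, table) where table maps (row, column) to a list of pairs.
    """
    mu = 2 ** m - 1
    # P[i], i = 1..mu, holds the coordinates of alpha^(i-1) as a bitmask.
    P = [0] * (mu + 1)
    v = 1
    for i in range(1, mu + 1):
        P[i] = v
        v <<= 1                  # multiply by alpha
        if (v >> m) & 1:         # reduce modulo the chosen polynomial
            v ^= poly
    index = {P[i]: i for i in range(1, mu + 1)}
    if v != 1 or len(index) != mu:
        raise ValueError("poly is not primitive of degree m")

    def third(i, j):
        # The line through Q_i and Q_j meets the prime in the point P_i + P_j.
        return index[P[i] ^ P[j]]

    table = {}
    for i, j in combinations(range(1, mu + 1), 2):
        r = third(i, j)                      # Q_i Q_j passes through P_r
        s = third(mu + 1 - i, mu + 1 - j)    # R_i R_j, with R_i = Q_{mu+1-i}
        table.setdefault((r, s), []).append((i, j))
    for i in range(1, mu + 1):
        # Q_0 Q_i passes through P_i; R_0 R_i (assumed R_0 = Q_0) through P_{mu+1-i}.
        table.setdefault((i, mu + 1 - i), []).append((i, 0))
    return mu, table


def is_room_square(mu, table):
    """True if no cell is shared and every row and column holds each digit once."""
    if any(len(entries) != 1 for entries in table.values()):
        return False
    digits = list(range(mu + 1))
    for k in range(1, mu + 1):
        row = sorted(d for (r, s), e in table.items() if r == k for d in e[0])
        col = sorted(d for (r, s), e in table.items() if s == k for d in e[0])
        if row != digits or col != digits:
            return False
    return True


if __name__ == "__main__":
    for m, poly in ((3, 0b1011), (4, 0b10011), (5, 0b100101)):
        mu, table = incidence_table(m, poly)
        print(m, mu, is_room_square(mu, table))
```

Under these assumptions the check succeeds for $m = 3$ and $m = 5$ and fails for $m = 4$, where the shared cells are exactly those of the south-west to north-east diagonal.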


6. For $m = 4$, we have $\mu = 15$ and we take $f(x) = x^4 - x - 1$. $R_i$ is now 
defined to be $Q_{16-i}$ for $i = 1, \cdots, 15$. Table 2 is obtained. 

Here the NE-SW diagonal is shared by two sets of entries. This is a 
characteristic feature arising when $m$ is even but not when $m$ is odd. 

In fact, going now to the simplest case where $m = 2$ and $f(x) = x^2 - x - 1$, 
the table which arises is as follows: 


For $m = 5$, $\mu = 31$. Take $f(x) = x^5 - x^2 - 1$. Define $R_i$ to be $Q_{32-i}$. Then we 
obtain Table 3 (only the first line of entries need be given). 


TABLE 3 


a ae 7 





12, 20 8,23 | 9,21 |14,15 10,17 











16,25 3,6 
7. If the columns of the first design of Section 5 be regarded as blocks, the 
rows as a set of treatments $a_1, \cdots, a_7$, and the numbers in the squares as a 
second set of treatments $b_0, \cdots, b_7$, then the design is an incomplete block with 
respect to the first set of treatments and a complete randomized block with 
respect to the second set of treatments. The design is also balanced with respect 
to combinations of different levels of treatment $a$, with different levels of 
treatment $b$. The usual parametric model (Model I) would be 

$$x_{tij} = A + B_t + \alpha_i + \beta_j + z_{tij}$$

(where $x_{tij}$ denotes the observation on treatment combination $a_ib_j$ in the $t$th 
block, $\sum B_t = \sum \alpha_i = \sum \beta_j = 0$ and the $z_{tij}$'s are mutually independent 
random variables with common variance and mean zero). The analysis of variance 
appropriate to this model is obtained as follows: 

(i) Carry out the standard incomplete block analysis on the means $\bar{x}_{ti\cdot}$ of 
pairs of observations for treatments $a_i$ in the same ($t$th) block. Multiply the 
resultant sums of squares by two. This will give the Between Blocks and 
Adjusted Between Treatments $a$ sums of squares in the final table. 

(ii) Compute the Between Treatments $b$ sum of squares in the usual way 
(that is, $7\sum_{j=0}^{7} (\bar{x}_{\cdot\cdot j} - \bar{x}_{\cdot\cdot\cdot})^2$). 

(iii) Compute the Residual sum of squares as Residual in (i) 
$+ \sum_t \sum_i \sum_j (x_{tij} - \bar{x}_{ti\cdot})^2 -$ Between Treatments $b$ sum of squares in (ii). 

The degrees of freedom appropriate to these sums of squares are then 


Blocks. . 

Adjusted Treatments a 
Treatments b 

Residual 
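
The degrees of freedom can be tallied from the dimensions of the design. The sketch below is derived from those dimensions (7 blocks, 7 treatments $a$, 8 treatments $b$, 4 occupied cells per block, 2 observations per cell) and from steps (i) to (iii) above; it is not copied from the original table, and the function name `degrees_of_freedom` is hypothetical.

```python
def degrees_of_freedom(blocks=7, treatments_a=7, treatments_b=8, cells_per_block=4):
    """Tally degrees of freedom for the analysis in steps (i) to (iii)."""
    cells = blocks * cells_per_block      # occupied cells of the square
    observations = 2 * cells              # each cell carries a pair of b-treatments
    df_blocks = blocks - 1
    df_adjusted_a = treatments_a - 1
    # Residual of the incomplete block analysis on the cell means, step (i).
    df_residual_i = (cells - 1) - df_blocks - df_adjusted_a
    df_within_pairs = cells               # one degree of freedom inside each pair
    df_b = treatments_b - 1
    # Step (iii): residual (i) plus within-pair variation minus treatments b.
    df_residual = df_residual_i + df_within_pairs - df_b
    return {
        "Blocks": df_blocks,
        "Adjusted Treatments a": df_adjusted_a,
        "Treatments b": df_b,
        "Residual": df_residual,
        "Total": observations - 1,
    }


if __name__ == "__main__":
    for source, df in degrees_of_freedom().items():
        print(f"{source:24s}{df:4d}")
```

The four component entries add up to the total, which is a quick consistency check on the decomposition.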


One advantage of this design lies in the fact that the treatment b sum of squares 
is orthogonal to the treatment a sum of squares. It is, unfortunately, not possible 
to test for interaction between the two sets of treatments. Certain specific inter- 
actions may, however, be isolated from the Residual sum of squares. For example 
the contrast $b_0$ vs. $b_1$ in the presence of $a_1$ can be compared with the average 
effect of the same contrast in the presence of $a_2, a_3, \cdots, a_7$, provided it is assumed 
that other interactions between $a$ and $b$ are negligible. The calculation of the 
sum of squares for such a contrast could be based on a two-way table with 
entries 

$$b_0a_1, \qquad b_1a_1, \qquad b_0\sum_{i=2}^{7} a_i, \qquad b_1\sum_{i=2}^{7} a_i$$


in the usual way. 

Alternatively, the design may be regarded as an incomplete block design for 
treatments a, with main plots split for treatment b. In this case the design should 
be regarded as an incomplete block design also with respect to treatments b. The 
model becomes 

$$x_{tij} = A + B_t + \alpha_i + \beta_j + u_{ti} + z_{tij}$$

where the $u_{ti}$'s are independent random variables, with zero mean and common 
variance, which are also independent of the $z_{tij}$'s. The two incomplete block 
analyses may be carried out separately (except that the Blocks sum of squares 
in the Treatments b analysis is the Total sum of squares in the Treatments a 
analysis). The sums of squares in the complete analysis, and their associated 
degrees of freedom, are 


Blocks                   } 
Adjusted Treatments a    } As in the original 
Error (i)                } analysis (i) 

Adjusted Treatments b 
Error (ii) 


As in the earlier analysis it is not possible, in general, to test for interaction 
between $a$ and $b$, but certain specific interactions can be isolated from Error (ii). 

Similar considerations apply to the second design of Section 7. 

The design shown in paragraph 8 is a supplemented incomplete block design 
(in the sense of [5]) with respect to treatment $a$. The analysis of the design will, 
however, be similar to that described above for the designs of Section 7, and in 
particular the Treatment b sum of squares will again be orthogonal to the ad- 
justed Treatment a sum of squares. 
A MULTIVARIATE TCHEBYCHEFF INEQUALITY 
By INGRAM OLKIN AND JOHN W. PRATT 
University of Chicago 


0. Abstract. A multivariate Tchebycheff inequality is given, in terms of the 
covariances of the random variables in question, and it is shown that the in- 
equality is sharp, i.e., the bound given can be achieved. This bound is obtained 
from the solution of a certain matrix equation and cannot be computed easily 
in general. Some properties of the solution are given, and the bound is given 
explicitly for some special cases. A less sharp but easily computed and useful 
bound is also given. 


1. Introduction and outline. Tchebycheff's inequality states that if $y$ is any 
real random variable with mean 0 and variance $\sigma^2$, then 

$$P(|y| \ge k\sigma) \le 1/k^2. \tag{1.1}$$

Berge [1] has generalized this result as follows. If $y_1$ and $y_2$ are any real 
random variables with means 0, variances $\sigma_1^2$ and $\sigma_2^2$ respectively, and correlation 
$\rho$, then 

$$P(|y_1| \ge k\sigma_1 \text{ or } |y_2| \ge k\sigma_2) \le \frac{1 + \sqrt{1 - \rho^2}}{k^2}. \tag{1.2}$$

Berge gives an example where the inequality is achieved. 

Suppose $y = (y_1, \cdots, y_p)$ is a random vector with mean 0 and nonsingular 
covariance matrix $\Sigma$. We seek an upper bound, depending on $\Sigma$ and $k_1, \cdots, 
k_p$, for $P(|y_i| \ge k_i\sigma_i \text{ for some } i)$. 

The problem can be reduced by letting $x_i = y_i/(k_i\sigma_i)$. Then $x = (x_1, \cdots, x_p)$ 
has mean 0 and covariance matrix $\Pi = K^{-1}RK^{-1}$, where $R = (\rho_{ij})$ is the 
correlation matrix of $y$ (and of $x$), $\Pi_{ij} = \sigma_{ij}/(\sigma_i\sigma_jk_ik_j) = \rho_{ij}/(k_ik_j)$, and $K$ is a 
diagonal matrix with diagonal elements $k_1, \cdots, k_p$. Furthermore, $|y_i| \ge k_i\sigma_i$ if 
and only if $|x_i| \ge 1$, so $P(|y_i| \ge k_i\sigma_i \text{ for some } i) = P(|x_i| \ge 1 \text{ for some } i)$. 

Suppose $A$ is a $p \times p$ matrix such that 

$$xAx' \ge 1 \quad \text{if } |x_i| = 1 \text{ for some } i. \tag{1.3}$$

Then, looking at scalar multiples of $x$, we see that 

$$xAx' \ge 1 \quad \text{if } |x_i| \ge 1 \text{ for some } i, \tag{1.4}$$

and that 

$$xAx' \ge 0 \quad \text{for all } x, \tag{1.5}$$

i.e., $A$ is positive definite. Therefore 

LEMMA 1.1. If $A$ satisfies (1.3), then 

$$P(|y_i| \ge k_i\sigma_i \text{ for some } i) = P(|x_i| \ge 1 \text{ for some } i) \le E(xAx') = \operatorname{tr} A\Pi, \tag{1.6}$$

where tr denotes trace. 


Each $A$ satisfying (1.3) therefore gives an upper bound for 
$P(|x_i| \ge 1 \text{ for some } i)$. 

The smallest bound obtainable in this way is the minimum of $\operatorname{tr} A\Pi$ over all $A$ 
satisfying (1.3). The set $\mathcal{A}$ of all such matrices $A$ is obviously convex, closed, 
and bounded from below, and $\operatorname{tr} A\Pi$ is linear in $A$, so this minimum is achieved 
at an extreme point of $\mathcal{A}$. In Theorem 3.3 it is shown that $A$ is an extreme 
point of $\mathcal{A}$ if and only if $A^{-1}$ is positive definite and has 1's on the main 
diagonal. Furthermore, there is a unique extreme point of $\mathcal{A}$ minimizing $\operatorname{tr} A\Pi$, 
namely that extreme point $A$ such that $A\Pi A$ is diagonal (Theorem 3.5). The 
bound thus obtained is the best possible, inasmuch as, if it is less than 1, there 
is a distribution for $x$ (with mean 0 and covariance matrix $\Pi$) under which it 
is achieved, and otherwise there is a distribution for $x$ under which 

$$P(|x_i| \ge 1 \text{ for some } i) = 1$$

(Theorem 3.7). 


The minimizing matrix is easy to compute explicitly only in some special 
cases (Sec. 5). In the case $p = 2$, $k_1 = k_2 = k$, Berge lets $A = \begin{pmatrix} 1 & a \\ a & 1 \end{pmatrix}^{-1}$, shows 
that $A$ satisfies (1.3), and minimizes $\operatorname{tr} A\Pi$ with respect to $a$. Following this 
lead, in Sec. 2 we let $A = [(1 - a)I + ae'e]^{-1}$, where $e = (1, \cdots, 1)$, show 
that $A$ satisfies (1.3) for $1 > a > -1/(p - 1)$, and minimize $\operatorname{tr} A\Pi$ with 
respect to $a$, obtaining the bound in Theorem 2.3. Though the minimum over 
such $A$ is in general, except in the case $p = 2$, not the minimum over all $A$ 
satisfying (1.3), it provides a useful and easily computed bound. Lal [3] 
considers a matrix similar in form to that of Sec. 2. However, this does not lead 
to the best bound, as Lal asserts, and indeed his bound is not as tight as that 
given in Theorem 2.3 unless $p = 2$ or $\rho_{ij} = 0$ for all $i \ne j$. 

2. A multivariate inequality. We will now carry out the program of the last 
paragraph. 


LEMMA 2.1. $A = [(1 - a)I + ae'e]^{-1}$ satisfies (1.3) if $1 > a > -1/(p - 1)$. 

PROOF. $A = [(1 - a)I + ae'e]^{-1} = (I - \alpha e'e)/(1 - a)$, where 
$\alpha = a/[1 + (p - 1)a]$. 

$$x(I - \alpha e'e)x' = \sum x_i^2 - \alpha\Big(\sum x_i\Big)^2 \ge \begin{cases} \sum x_i^2 & \text{if } 0 \ge a > -1/(p-1), \text{ i.e., } \alpha \le 0, \\ (1 - p\alpha)\sum x_i^2 & \text{if } 0 \le a < 1, \text{ i.e., } 0 \le \alpha < 1/p. \end{cases}$$

(The second case follows from $(\sum x_i)^2 \le p\sum x_i^2$.) The right-hand side becomes 
infinite with $\sum x_i^2$, so the minimum over all $(p - 1)$-vectors $z$ of 
$(1, z)(I - \alpha e'e)(1, z)'$ occurs at a finite $z$. Differentiating 

$$(1, z)(I - \alpha e'e)(1, z)' = 1 + \sum z_i^2 - \alpha\Big(1 + \sum z_i\Big)^2$$

with respect to each $z_i$ we find that the minimizing $z$ must satisfy 
$2z_i - 2\alpha(1 + \sum z_j) = 0$ for all $i$, or $z - \alpha ze'e - \alpha e = 0$. (Here $e$ has $p - 1$ coordinates.) 
It follows that all $z_i$ are equal, and that $\sum z_i = (p - 1)a$, so $z = ae$. 
Therefore the minimum over $z$ of $(1, z)(I - \alpha e'e)(1, z)'$ is $1 - a$, and thus the 
minimum over $z$ of $(1, z)A(1, z)'$ is 1. The lemma follows. (See also Lemma 5.1.) | (This symbol will be used to 
indicate the end of a proof.) 


LEMMA 2.2. $\operatorname{tr} [(1 - a)I + ae'e]^{-1}\Pi$ is minimized for $1 > a > -1/(p - 1)$ 
by 

$$a = \frac{t - \sqrt{u(pt - u)/(p - 1)}}{u - (p - 1)t}, \tag{2.1}$$

where $t = \operatorname{tr}\Pi = \sum \Pi_{ii} = \sum 1/k_i^2$ and $u = e\Pi e' = \sum \Pi_{ij} = \sum \rho_{ij}/(k_ik_j)$. 

PROOF. $\operatorname{tr} [(1 - a)I + ae'e]^{-1}\Pi = \operatorname{tr}(I - \alpha e'e)\Pi/(1 - a) = (t - \alpha u)/(1 - a)$. 
The derivative of this quantity with respect to $a$ has zeros at 

$$a = \frac{t \pm \sqrt{u(pt - u)/(p - 1)}}{u - (p - 1)t}.$$

The condition $1 > a > -1/(p - 1)$ is satisfied if and only if 
$\mp\sqrt{u(pt - u)/(p - 1)}$ 
is between $u/(p - 1)$ and $(pt - u)$. The upper sign is impossible because 
$u/(p - 1)$ and $(pt - u)$ 
are both positive. The lower sign is possible because $\sqrt{u(pt - u)/(p - 1)}$ is 
the geometric mean of $u/(p - 1)$ and $(pt - u)$. The extremum is a minimum 
since $(t - \alpha u)/(1 - a) \to \infty$ as $a \to 1$ or $a \to -1/(p - 1)$. 

Substituting (2.1) in (1.6) and simplifying, we obtain, by Lemmas 1.1, 2.1, 
and 2.2, 
THEOREM 2.3. $P(|y_i| \ge k_i\sigma_i \text{ for some } i) = P(|x_i| \ge 1 \text{ for some } i)$ 

$$\le \frac{\big[\sqrt{u} + \sqrt{(pt - u)(p - 1)}\,\big]^2}{p^2}.$$

In the case $p = 2$, we obtain 

$$P(|y_1| \ge k_1\sigma_1 \text{ or } |y_2| \ge k_2\sigma_2) \le \frac{k_1^2 + k_2^2 + \sqrt{(k_1^2 + k_2^2)^2 - 4\rho^2k_1^2k_2^2}}{2k_1^2k_2^2},$$

which is Lal's equation (B), and is to be compared with Berge's result, (1.2). 
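
The bound of Theorem 2.3 depends on the data only through $t = \sum 1/k_i^2$ and $u = \sum \rho_{ij}/(k_ik_j)$, so it is easy to evaluate. The following is a minimal sketch; the function name `theorem_2_3_bound` and the list-of-lists representation of the correlation matrix are assumptions made for the illustration. The clamping to zero only guards against rounding error, since $u \ge 0$ and $pt - u \ge 0$ hold for any correlation matrix.

```python
import math


def theorem_2_3_bound(R, k):
    """Evaluate [sqrt(u) + sqrt((p t - u)(p - 1))]^2 / p^2.

    R is the p x p correlation matrix (list of lists) and k the list
    of multipliers k_1, ..., k_p.
    """
    p = len(k)
    t = sum(1.0 / (ki * ki) for ki in k)
    u = sum(R[i][j] / (k[i] * k[j]) for i in range(p) for j in range(p))
    u = max(u, 0.0)                 # guard against rounding below zero
    rest = max(p * t - u, 0.0)      # guard against rounding below zero
    return (math.sqrt(u) + math.sqrt(rest * (p - 1))) ** 2 / p ** 2


if __name__ == "__main__":
    rho, k = 0.6, 2.0
    R = [[1.0, rho], [rho, 1.0]]
    # For p = 2 and equal k this should agree with Berge's bound (1.2).
    print(theorem_2_3_bound(R, [k, k]), (1 + math.sqrt(1 - rho * rho)) / (k * k))
```

For $p = 2$ and $k_1 = k_2 = k$ the value coincides with Berge's bound (1.2), which gives a simple check on the implementation.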


3. The sharpest inequality. In this section we seek the tightest bound 
obtainable from Lemma 1.1, and show that it is sharp, following the outline in the 
next-to-last paragraph of Sec. 1. What we seek, then, is the minimum of $\operatorname{tr} A\Pi$ 
for $A$ satisfying (1.3), i.e., for $A \in \mathcal{A}$. As remarked before, the minimum occurs 
at an extreme point of $\mathcal{A}$. We start by characterizing, in Lemma 3.2, the 
matrices in $\mathcal{A}$, and, in Theorem 3.3, the extreme points of $\mathcal{A}$. We use the following 
lemma, which has some independent interest. 

LEMMA 3.1. If $A$ is positive definite, the minimum of $xAx'$ for $x_1 = 1$ is $1/b_{11}$ 
and occurs at $(1, b/b_{11})$, and only there, where 

$$B = A^{-1} = \begin{pmatrix} b_{11} & b \\ b' & B_{22} \end{pmatrix},$$

with $A$ partitioned conformably as $A = \begin{pmatrix} a_{11} & a \\ a' & A_{22} \end{pmatrix}$. 

PROOF. It is easily checked that 

$$b_{11} = (a_{11} - aA_{22}^{-1}a')^{-1}, \qquad b = -b_{11}aA_{22}^{-1}, \qquad B_{22} = A_{22}^{-1} + A_{22}^{-1}a'b_{11}aA_{22}^{-1}.$$

"Completing the square," we have 

$$(1, z)A(1, z)' = a_{11} + 2az' + zA_{22}z' = a_{11} - aA_{22}^{-1}a' + (z + aA_{22}^{-1})A_{22}(z + aA_{22}^{-1})' = b_{11}^{-1} + (z - b_{11}^{-1}b)A_{22}(z - b_{11}^{-1}b)'.$$

Since $A_{22}$ is positive definite, the lemma follows. Alternatively, $(1, z)A(1, z)'$ 
could be differentiated with respect to each coordinate of $z$, as in the proof of 
Lemma 2.1. 


It follows from this lemma and (1.5) that 

LEMMA 3.2. $A \in \mathcal{A}$ if and only if $B = A^{-1}$ is positive definite and $b_{ii} \le 1$, 
$i = 1, \cdots, p$. 

THEOREM 3.3. $A$ is extreme in $\mathcal{A}$ if and only if $B = A^{-1}$ is positive definite and 
$b_{ii} = 1$, $i = 1, \cdots, p$. 

PROOF. (i) Suppose $B$ is positive definite and all $b_{ii} = 1$. Then, by Lemma 
3.2, $A \in \mathcal{A}$. Suppose $A = (A_1 + A_2)/2$, $A_1 \in \mathcal{A}$, $A_2 \in \mathcal{A}$. For each $i$, by Lemma 
3.1, 

$$1 = 1/b_{ii} = \min_{x_i = 1} xAx' \ge \tfrac{1}{2}\Big[\min_{x_i = 1} xA_1x' + \min_{x_i = 1} xA_2x'\Big],$$

$$\min_{x_i = 1} xA_1x' \ge 1, \qquad \min_{x_i = 1} xA_2x' \ge 1.$$

It follows that 

$$\min_{x_i = 1} xA_1x' = 1 = \min_{x_i = 1} xA_2x',$$

and the minima occur at the same point. This implies, by Lemma 3.1, that the 
$i$th row of $A_1^{-1}$ equals the $i$th row of $A_2^{-1}$. As this is true for each $i$, $A_1 = A_2$. 
Therefore $A$ is extreme in $\mathcal{A}$, which proves the "if". 
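(ii) If $B$ is not positive definite, $A \notin \mathcal{A}$, by Lemma 3.2. Suppose $B$ is positive 
definite but $b_{ii} < 1$ for some $i$, say $b_{11} < 1$. Let 

$$B(\delta) = B + \begin{pmatrix} \delta & 0 \\ 0 & 0 \end{pmatrix}.$$

By Lemma 3.2, $B^{-1}(\delta) \in \mathcal{A}$ for $\delta$ small enough. If we can choose $\delta_1 \ne \delta_2$ such 
that $B^{-1}(\delta_1) \in \mathcal{A}$, $B^{-1}(\delta_2) \in \mathcal{A}$, and 

$$A = B^{-1} = \theta B^{-1}(\delta_1) + (1 - \theta)B^{-1}(\delta_2) \tag{3.1}$$

for some $\theta$, $0 < \theta < 1$, we will have shown that $A$ is not extreme in $\mathcal{A}$. 

According to the first sentence of the proof of Lemma 3.1, with $A$ and $B$ 
interchanged, $B^{-1}(\delta)$ is a linear function of its upper left element $a_{11}(\delta)$, so (3.1) is 
equivalent to 

$$a_{11} = a_{11}(0) = \theta a_{11}(\delta_1) + (1 - \theta)a_{11}(\delta_2).$$

Furthermore, 

$$a_{11}(\delta) = \frac{1}{b_{11} + \delta - bB_{22}^{-1}b'} = \frac{1}{1/a_{11} + \delta} = \frac{a_{11}}{1 + \delta a_{11}}.$$

Therefore (3.1) is equivalent to 

$$\frac{\theta\delta_1}{1 + \delta_1a_{11}} + \frac{(1 - \theta)\delta_2}{1 + \delta_2a_{11}} = 0,$$

and it is clear that $\delta_1$ and $\delta_2$ can be chosen as desired. 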

This reduces the problem to that of minimizing $\operatorname{tr} B^{-1}\Pi$ for $B \in \mathcal{B}$, where $\mathcal{B}$ 
is the set of positive definite matrices with ones on the main diagonal. We will 
now show that $\operatorname{tr} B^{-1}\Pi$ is minimized at a unique interior point $B$ of $\mathcal{B}$ 
(Theorem 3.4), and characterize $B$ (Theorem 3.5). 

THEOREM 3.4. $\operatorname{tr} B^{-1}\Pi$ is a strictly convex function of $B$ for $B \in \mathcal{B}$, and has a 
unique minimum, which occurs at an interior point $B$ of $\mathcal{B}$. 

PROOF. Let $B(t)$ be a straight line in $\mathcal{B}$. Then $dB/dt$ is a symmetric matrix, 
$d^2B/dt^2 = 0$, and 

$$\frac{d}{dt}\operatorname{tr} B^{-1}\Pi = -\operatorname{tr} B^{-1}\Big(\frac{dB}{dt}\Big)B^{-1}\Pi,$$

$$\frac{d^2}{dt^2}\operatorname{tr} B^{-1}\Pi = 2\operatorname{tr} B^{-1}\Big(\frac{dB}{dt}\Big)B^{-1}\Big(\frac{dB}{dt}\Big)B^{-1}\Pi > 0.$$

This proves the strict convexity. The rest follows, since $\mathcal{B}$ is convex 
and bounded, and $\operatorname{tr} B^{-1}\Pi \to \infty$ as $B$ approaches the boundary of $\mathcal{B}$. The latter 
follows from the fact that 

$$\operatorname{tr} B^{-1}\Pi \ge (\operatorname{tr} B^{-1})(\text{smallest eigenvalue of } \Pi).$$


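THEOREM 3.5. $B$ is the unique point of $\mathcal{B}$ such that $B^{-1}\Pi B^{-1}$, or equivalently 
$B\Pi^{-1}B$, is diagonal. 

PROOF. By Theorem 3.4, $B$ is the unique point of $\mathcal{B}$ for which 

$$\frac{\partial}{\partial b_{ij}}\operatorname{tr} B^{-1}\Pi = -\operatorname{tr} B^{-1}\Big(\frac{\partial B}{\partial b_{ij}}\Big)B^{-1}\Pi = -\operatorname{tr}\Big(\frac{\partial B}{\partial b_{ij}}\Big)B^{-1}\Pi B^{-1} = -2c_{ij} = 0$$

for $i \ne j$, where $C = B^{-1}\Pi B^{-1}$, and $\partial B/\partial b_{ij}$ is a matrix with all elements zero 
except the $(i, j)$-th and $(j, i)$-th, which are one. 
We note that $B^{-1}\Pi B^{-1} = C$ if and only if 

$$B = \Pi^{1/2}(\Pi^{1/2}C\Pi^{1/2})^{-1/2}\Pi^{1/2} = C^{-1/2}(C^{1/2}\Pi C^{1/2})^{1/2}C^{-1/2}.$$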
By Theorems 3.3, 3.4, and 3.5, the tightest inequality obtainable from Lemma 
1.1 is 

THEOREM 3.6. $P(|y_i| \ge k_i\sigma_i \text{ for some } i) = P(|x_i| \ge 1 \text{ for some } i)$ 

$$\le \operatorname{tr} B^{-1}\Pi = \operatorname{tr} B^{-1}\Pi B^{-1},$$

where $B$ is the unique positive definite matrix having ones on the main diagonal 
such that $B\Pi^{-1}B$ is diagonal. 

We note that $\operatorname{tr} B^{-1}\Pi = \operatorname{tr}(B^{-1}\Pi B^{-1})B = \operatorname{tr} B^{-1}\Pi B^{-1}$, since $B^{-1}\Pi B^{-1}$ is 
diagonal and $B$ has ones on the diagonal. 

According to the following theorem, the bound given in Theorem 3.6 is the 
smallest possible bound except when the smallest possible bound is the trivial 
bound 1. 

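THEOREM 3.7. Let $\Theta = B^{-1}\Pi B^{-1}$ and $\theta_1, \cdots, \theta_p$ be its diagonal elements. 
Then 

$$\operatorname{tr} B^{-1}\Pi = \operatorname{tr} B^{-1}\Pi B^{-1} = \operatorname{tr}\Theta = \sum \theta_i.$$

If $\sum \theta_i \le 1$, equality holds in Theorem 3.6 if and only if 

$$P(x = b^i) = P(x = -b^i) = \theta_i/2, \quad i = 1, \cdots, p, \qquad P(x = 0) = 1 - \sum \theta_i, \tag{3.2}$$

where $b^1, \cdots, b^p$ are the rows of $B$. If $\sum \theta_i > 1$, $P(|x_i| \ge 1 \text{ for some } i) = 1$ if 

$$P\Big(x = \sqrt{\textstyle\sum \theta_j}\, b^i\Big) = P\Big(x = -\sqrt{\textstyle\sum \theta_j}\, b^i\Big) = \theta_i\Big/\Big(2\sum \theta_j\Big), \quad i = 1, \cdots, p. \tag{3.3}$$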

PROOF. If $\sum \theta_i \le 1$, (3.2) is a distribution for $x$, and if $x$ has this 
distribution, equality holds in Theorem 3.6. If $x$ has the distribution (3.3) and $\sum \theta_i > 1$, 
then, with probability one, $|x_i| \ge \sqrt{\sum \theta_j} > 1$ for some $i$. In either case, $x$ has 
mean 0 and covariance matrix 

$$E(x'x) = \sum \theta_i b^{i\prime}b^i = B\Theta B = \Pi.$$

This proves the "if". 
It remains to prove the "only if". Suppose $\sum \theta_i \le 1$ and equality holds in 
Theorem 3.6. Then, by the relation of (1.6) to (1.4) and (1.5), with probability 
one, 

$$xB^{-1}x' = 1 \quad \text{if } |x_i| \ge 1 \text{ for some } i,$$

$$xB^{-1}x' = 0 \quad \text{otherwise.}$$

It follows, by Lemma 3.1, that the distribution of $x$ is concentrated at 0 and 
$\pm b^1, \cdots, \pm b^p$. Then 

$$E(x) = \sum_i [P(x = b^i) - P(x = -b^i)]\,b^i.$$

But $E(x) = 0$ and $b^1, \cdots, b^p$ are linearly independent, since they are the rows 
of a non-singular matrix, so $P(x = b^i) = P(x = -b^i)$ for all $i$. Then 

$$E(x'x) = \sum_i 2P(x = b^i)\,b^{i\prime}b^i = BDB,$$

where $D$ is a diagonal matrix with diagonal elements 
$2P(x = b^1), \cdots, 2P(x = b^p)$. 
But $E(x'x) = \Pi$, so 
$D = B^{-1}\Pi B^{-1} = \Theta$, 
and (3.2) follows. 
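
The extremal distribution (3.2) can be written out explicitly in the simplest case. The sketch below assumes $p = 2$, $k_1 = k_2 = k$ and $0 < |\rho| < 1$, so that $B = \begin{pmatrix} 1 & a \\ a & 1 \end{pmatrix}$ with $a = (1 - \sqrt{1 - \rho^2})/\rho$ and $\theta_1 = \theta_2 = 1/[k^2(1 + a^2)]$ (these follow from $\Pi = B\Theta B$); the function names are hypothetical. It builds the five-point distribution and computes its mean, its covariance matrix and the probability that some $|x_i| \ge 1$.

```python
import math


def extremal_distribution(rho, k):
    """Five-point distribution (3.2) for p = 2 and k_1 = k_2 = k.

    Returns a list of (point, probability) pairs. Assumes 0 < |rho| < 1
    and that the bound 2 * theta does not exceed 1.
    """
    a = (1.0 - math.sqrt(1.0 - rho * rho)) / rho
    theta = 1.0 / (k * k * (1.0 + a * a))
    if 2.0 * theta > 1.0:
        raise ValueError("bound exceeds 1; distribution (3.3) applies instead")
    rows = [(1.0, a), (a, 1.0)]              # rows b^1, b^2 of B
    dist = [((0.0, 0.0), 1.0 - 2.0 * theta)]
    for b in rows:
        dist.append((b, theta / 2.0))
        dist.append(((-b[0], -b[1]), theta / 2.0))
    return dist


def summarize(dist):
    """Return (mean, covariance, P(|x_i| >= 1 for some i)) of a discrete law."""
    mean = [sum(pr * x[i] for x, pr in dist) for i in range(2)]
    cov = [[sum(pr * x[i] * x[j] for x, pr in dist) for j in range(2)]
           for i in range(2)]
    hit = sum(pr for x, pr in dist if max(abs(x[0]), abs(x[1])) >= 1.0)
    return mean, cov, hit


if __name__ == "__main__":
    mean, cov, hit = summarize(extremal_distribution(0.6, 2.0))
    print(mean, cov, hit)
```

With $\rho = 0.6$ and $k = 2$ the covariance matrix has entries $1/k^2$ and $\rho/k^2$ and the exceedance probability equals Berge's bound (1.2), illustrating that the bound is attained.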
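4. On the solution of $B\Theta B = \Pi$. From $\Pi = B\Theta B$, we find that 

$$\Pi_{ij} = \sum_\alpha b_{i\alpha}\theta_\alpha b_{\alpha j};$$

and for $i = j$ we have the system of equations 

$$1/k_i^2 = \sum_\alpha b_{i\alpha}^2\theta_\alpha.$$

If we write $B \times B = (b_{ij}^2)$, then 

$$(\theta_1, \cdots, \theta_p) = (k_1^{-2}, \cdots, k_p^{-2})(B \times B)^{-1}.$$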
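Thus given $B$ and $k_1, \cdots, k_p$, we can solve for $\Theta$ and $\Pi$. The matrix $B \times B$ 
is the Hadamard product, and is positive definite if $B$ is ([2], p. 143). Given 
$k_1, \cdots, k_p$, $B$ results from some $\Pi$ if and only if $B \in \mathcal{B}$ and 

$$(k_1^{-2}, \cdots, k_p^{-2})(B \times B)^{-1}$$

has positive elements. The following example shows that this last condition is 
not automatically satisfied. 

$$B = \begin{pmatrix} 1 & .8 & .8 \\ .8 & 1 & .5 \\ .8 & .5 & 1 \end{pmatrix}, \quad B \times B = \begin{pmatrix} 1 & .64 & .64 \\ .64 & 1 & .25 \\ .64 & .25 & 1 \end{pmatrix}, \quad (B \times B)^{-1} = \frac{1}{3231}\begin{pmatrix} 9375 & -4800 & -4800 \\ -4800 & 5904 & 1596 \\ -4800 & 1596 & 5904 \end{pmatrix}.$$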


Every $B \in \mathcal{B}$ results from some $k_1, \cdots, k_p$ and $\Pi$, e.g., for 
$(k_1^{-2}, \cdots, k_p^{-2}) = (1, \cdots, 1)(B \times B)$. 
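
The relation $(\theta_1, \cdots, \theta_p) = (k_1^{-2}, \cdots, k_p^{-2})(B \times B)^{-1}$ is a linear system and can be solved exactly. The sketch below uses rational arithmetic and a small Gaussian elimination; the helper names `solve` and `theta_from_B` are hypothetical, and the $3 \times 3$ matrix with off-diagonal elements $.8, .8, .5$ (taken positive) together with $k_1 = k_2 = k_3 = 1$ are assumptions chosen for the illustration. Because $B \times B$ is symmetric, solving $(B \times B)\theta' = (k^{-2})'$ gives the same vector.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M x = rhs exactly by Gauss-Jordan elimination over the rationals."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def theta_from_B(B, k):
    """Diagonal of Theta implied by B and k_1, ..., k_p via the Hadamard square."""
    hadamard = [[Fraction(b) ** 2 for b in row] for row in B]
    rhs = [1 / Fraction(ki) ** 2 for ki in k]
    return solve(hadamard, rhs)


if __name__ == "__main__":
    B = [[1, Fraction(4, 5), Fraction(4, 5)],
         [Fraction(4, 5), 1, Fraction(1, 2)],
         [Fraction(4, 5), Fraction(1, 2), 1]]
    # With equal k the first component is negative, so no Pi yields this B.
    print(theta_from_B(B, [1, 1, 1]))
```

A negative component means that this $B$ cannot arise from any $\Pi$ with the given $k_i$, whereas choosing $k_i^{-2}$ as the row sums of $B \times B$ returns $\theta = (1, \cdots, 1)$.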


This section began with a procedure for determining $\Pi$ from $B$ by standard 
matrix operations. It appears that $B$ cannot be obtained from $\Pi$ by standard 
matrix operations except in special cases. We now give two properties of the 
solution (Theorems 4.1 and 4.2). 

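THEOREM 4.1. If $P$ is a permutation matrix and $P\Pi P' = \Pi$, then $PBP' = B$. 

PROOF. 

$$(PBP')\Pi^{-1}(PBP') = PB\Pi^{-1}BP' = P\Theta^{-1}P',$$

which is diagonal, and $PBP' \in \mathcal{B}$, so by the uniqueness in Theorem 3.5, $PBP' = B$. 

THEOREM 4.2. If $\Pi = \begin{pmatrix} \Pi_1 & 0 \\ 0 & \Pi_2 \end{pmatrix}$ then $B = \begin{pmatrix} B_1 & 0 \\ 0 & B_2 \end{pmatrix}$, where $B_i$ minimizes 
$\operatorname{tr} B_i^{-1}\Pi_i$ in $\mathcal{B}_i$, $i = 1, 2$. 

PROOF. If $B_i\Pi_i^{-1}B_i$ is diagonal, $i = 1, 2$, then $B\Pi^{-1}B$ is diagonal, and by the 
uniqueness of $B$, the conclusion follows. 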
5. Special cases. 

THEOREM 5.1. If $\Pi^{1/2}$ has equal diagonal elements, say, $d$, then 

$$B = \Pi^{1/2}/d, \qquad \Theta = d^2I,$$

$$P(|y_i| \ge k_i\sigma_i \text{ for some } i) = P(|x_i| \ge 1 \text{ for some } i) \le \operatorname{tr} B^{-1}\Pi = d^2p.$$

This follows from Theorem 3.5. (The result for singular $\Pi$ is an easy 
consequence of the result for non-singular $\Pi$.) 

We note that $\Pi^{1/2}$ has equal diagonal elements if the group of permutation 
matrices $P$ such that $P\Pi P' = \Pi$ is transitive, i.e., every coordinate of $x$ can be 
carried into every other one by a permutation of coordinates which preserves 
the covariances, i.e., $k_1 = \cdots = k_p$, and every coordinate of $y$ can be carried 
into every other by a permutation of coordinates which preserves the 
correlations. This follows from the fact that $P\Pi^{1/2}P' = \Pi^{1/2}$ if $P\Pi P' = \Pi$, since then 
$(P\Pi^{1/2}P')^2 = P\Pi P' = \Pi$. 


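$B = (1 - a)I + ae'e$, i.e., the inequality of Sec. 2 is the best possible, if and 
only if the elements of $\Pi$ are 

$$\Pi_{ii} = 1/k_i^2, \qquad \Pi_{ij} = \frac{a}{1 + a}\Big[k_i^{-2} + k_j^{-2} + \frac{a(1 - a)}{1 + (p - 1)a^2}\sum_\alpha k_\alpha^{-2}\Big], \quad i \ne j, \tag{5.1}$$

in which case 

$$P(|y_i| \ge k_i\sigma_i \text{ for some } i) \le \operatorname{tr} B^{-1}\Pi = \sum k_i^{-2}/[1 + (p - 1)a^2]. \tag{5.2}$$

In the case $p = 2$, $\Pi$ is always of this form and (5.2) yields (2.6). 

If $k_1 = \cdots = k_p = k$, and $\Pi_{ii} = 1/k^2$, $\Pi_{ij} = \rho/k^2$, then $\Pi$ is of the form 
(5.1) and 

$$P(|y_i| \ge k\sigma_i \text{ for some } i) \le \operatorname{tr} B^{-1}\Pi = \frac{p}{[1 + (p - 1)a^2]k^2} = \frac{[(p - 1)\sqrt{1 - \rho} + \sqrt{1 + (p - 1)\rho}\,]^2}{pk^2}. \tag{5.3}$$

This could also be obtained from Theorem 2.3, or from Theorem 5.1, since 

$$\Pi^{1/2} = \frac{\sqrt{1 - \rho}}{k}\,I + \frac{\sqrt{1 + (p - 1)\rho} - \sqrt{1 - \rho}}{kp}\,e'e.$$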


For special values of $p$ and $\rho$ we obtain in addition to Berge's result (1.2), 
the following inequalities. 
(i) $\rho = 1$: $P(|y_i| \ge k\sigma_i \text{ for some } i) \le 1/k^2$, which amounts to the univariate 
Tchebycheff inequality. 

(ii) $\rho = 0$: For $p$ uncorrelated random variables, 
$P(|y_i| \ge k_i\sigma_i \text{ for some } i) \le \sum k_i^{-2}$, 
whereas for $p$ independent random variables, the univariate Tchebycheff 
inequality yields the bound $1 - \prod_{i=1}^{p}(1 - k_i^{-2})$. 
(iii) $\rho = -1/(p - 1)$: $P(|y_i| \ge k\sigma_i \text{ for some } i) \le (p - 1)/k^2$. 
ON SELECTING A SUBSET WHICH CONTAINS ALL POPULATIONS 
BETTER THAN A STANDARD 


By SHANTI S. GUPTA AND MILTON SOBEL 


Bell Telephone Laboratories 


1. Summary. A procedure is given for selecting a subset such that the prob- 
ability that all the populations better than the standard are included in the sub- 
set is equal to or greater than a predetermined number P*. Section 3 deals with 
the problem of the location parameter for the normal distribution with known 
and unknown variance. Section 4 deals with the scale parameter problem for 
the normal distribution with known and unknown mean as well as the chi- 
square distribution. Section 5 deals with binomial distributions where the param- 
eter of interest is the probability of failure on a single trial. In each of the above 
cases the case of known standard and unknown standard are treated separately. 
Tables are available for some problems; in other problems transformations 
are used such that the given tables are again appropriate. 


2. Introduction. C. W. Dunnett [3] has considered a different but related prob- 
lem of comparing several treatment means with a control mean for normal dis- 
tributions with a common unknown variance. His goal is to separate those treat- 
ments which are better than the control from those that are worse (or not better). 
He controls the probability of selecting the standard as the best (i.e., classifying 
all other treatments as worse) when the treatments are all equal to (or worse than) 
the standard. Earlier, E. Paulson [8] considered the problem of selecting the best 
one of k categories when comparing k — 1 categories with a standard. He deals 
with means of normal distributions with a common unknown variance and also 
with binomial distributions. He controls the probability of selecting the stand- 
ard as the best when the categories are equal to (or worse than) the standard. 

The procedure described in this paper controls the probability that the selected 
subset contains all those populations better than the control for any possible true 
configuration. If we define a correct decision as a selected subset which contains 
all those populations better than the standard, then the procedure given below 
guarantees a probability of a correct decision to be at least P*, not only when the 
k — 1 populations are equal to (or worse than) the standard, but for any pos- 
sible true configuration. Although we are comparing the procedure with the work 
noted above, it should be stressed that the goals are different and the procedures 
are not interchangeable. It should be noted that the treatment of Secs. 3 and 4 
could be applied to several other distributions in the Koopman-Darmois family. 

The goal treated in this paper is more flexible in that it allows the experimenter 
to choose a subset and withhold judgment about which is the best one. Then, if 
the best one is desired it can be chosen from the selected subset on the basis 
of economic or other considerations. 

Although the title and discussion above use the phraseology "populations
better than a standard" we shall actually be interested in selecting all
populations as good as or better than the standard; for practical purposes the
distinction is of minor importance since in most of the practical problems the
parameters of interest can have any value in some interval and are very rarely equal.

To discuss confidence statements we consider first the problems below in
which the better populations are the ones with the larger values of the main
parameter of interest $\tau$. After the experiment is performed, we can make with
confidence P* the joint statement that for all populations which are eliminated 
the parameter value is less than that of the standard. This joint confidence state- 
ment follows from the fact that in selecting a subset containing all populations 
as good as or better than a standard we are automatically eliminating a subset 
containing only populations worse than the standard. Hence this procedure can 
be used to eliminate those populations which are distinctly inferior to the stand- 
ard. 

For the case in which the better populations are defined to be the ones with
the smaller values of $\tau$, the statistical problem is identical and all the results
and tables of this paper apply with the obvious modifications.


3. Location parameter: normal populations. We shall assume that populations
$\Pi_1, \Pi_2, \ldots, \Pi_p$ with unknown means $\mu_1, \mu_2, \ldots, \mu_p$,
respectively, are given and that $\Pi_0$ is the standard or control, whose mean
$\mu_0$ may or may not be known. For clarity we shall discuss the various cases
separately.

Case A. Common known variance ($\mu_0$ known). From each of the $p$ populations
$\Pi_i$ $(i = 1, 2, \ldots, p)$, $n_i$ independent observations are taken. Let
$\bar{x}_i$ denote the sample mean from $\Pi_i$ and let $\sigma^2$ be the common
known variance.

Procedure: "Retain in the selected subset those and only those populations
$\Pi_i$ $(i = 1, 2, \ldots, p)$ for which

(3.1)    $\bar{x}_i \ge \mu_0 - d\sigma/\sqrt{n_i}$."


To determine the value of $d$ let $p_1$, $p_2$ denote the true number of populations
with $\mu \ge \mu_0$ and $\mu < \mu_0$, respectively, so that $p_1 + p_2 = p$. Then the
probability $P$ of retaining all the $p_1$ populations with $\mu \ge \mu_0$ is given by

(3.2)    $P = \prod_{i=1}^{p_1} P\{\bar{x}_i' \ge \mu_0 - d\sigma/\sqrt{n_i'}\}
            = \prod_{i=1}^{p_1} P\{\sqrt{n_i'}(\bar{x}_i' - \mu_i')/\sigma \ge -d + \sqrt{n_i'}(\mu_0 - \mu_i')/\sigma\}$,

where primes refer to values associated with the $p_1$ populations for which
$\mu \ge \mu_0$. Hence

(3.3)    $P = \prod_{i=1}^{p_1} \{1 - F(-d + \sqrt{n_i'}(\mu_0 - \mu_i')/\sigma)\}$,

where $F(x)$ refers to the standard normal cumulative distribution function. The
$\mu_i'$ in (3.3) are restricted by the condition $\mu_i' \ge \mu_0$ and the minimum of
(3.3) is attained by setting $\mu_i' = \mu_0$ $(i = 1, 2, \ldots, p_1)$. Since the result
depends on the unknown integer $p_1$, we can obtain a lower bound by setting
$p_1 = p$. Then using the symmetry of $F$ we have

(3.4)    $P \ge F^p(d)$.

The equation determining $d$ is obtained by setting the right-hand member of
(3.4) equal to $P^*$ and is given by

(3.5)    $F(d) = (P^*)^{1/p}$.

It should be noted that (3.4) is independent of $\mu_0$, $\sigma$ and $n_i$. Hence with a
table of the standard normal c.d.f. one can easily find the appropriate $d$ which
satisfies (3.5) and is to be used in rule (3.1) for any $\mu_0$, any $\sigma$ and any
vector $n_i$.
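As a minimal sketch of Case A (not part of the original paper), the constant $d$ of (3.5) can be obtained from the inverse normal c.d.f. and then used in rule (3.1). The function names and the sample figures below are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def case_a_constant(p_star: float, p: int) -> float:
    """Solve F(d) = (P*)^(1/p), equation (3.5), for d."""
    return NormalDist().inv_cdf(p_star ** (1.0 / p))


def retain(sample_means, sizes, mu0, sigma, d):
    """Indices i for which xbar_i >= mu0 - d*sigma/sqrt(n_i), rule (3.1)."""
    return [i for i, (xbar, n) in enumerate(zip(sample_means, sizes))
            if xbar >= mu0 - d * sigma / sqrt(n)]


# Hypothetical data: four populations compared with a known standard mean.
d = case_a_constant(0.95, 4)
kept = retain([9.1, 10.4, 9.9, 8.2], [10, 12, 10, 8], mu0=10.0, sigma=1.5, d=d)
```

With these made-up figures the last population falls below its cutoff and is eliminated, while the other three are retained.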

The case when the normal populations have different but known variances
and the standard is known is treated similarly. The inequality defining the
procedure for this problem, corresponding to (3.1), is

(3.6)    $\bar{x}_i \ge \mu_0 - d\sigma_i/\sqrt{n_i}$

and the equation determining $d$ is exactly the same as (3.5).

Case B. Common known variance ($\mu_0$ unknown). In this case $n_0$ independent
observations are taken on the standard $\Pi_0$. Let $\bar{x}_0$ denote the mean of
these $n_0$ observations and let $\sigma^2$ be the known common variance for all the
$(p + 1)$ populations. Then the procedure is to select all those populations for
which the relation

(3.7)    $\bar{x}_i \ge \bar{x}_0 - d\sigma/\sqrt{n_i}$

is satisfied. The equation determining $d$ is obtained by the same argument as
in Case A and, letting $f(x)$ denote the standard normal density, we obtain

(3.8)    $\int_{-\infty}^{\infty} \prod_{i=1}^{p} \left[F\left(u\sqrt{n_i/n_0} + d\right)\right] f(u)\,du = P^*$.

For the special case $n_i = n$ $(i = 0, 1, \ldots, p)$ this reduces to

(3.9)    $\int_{-\infty}^{\infty} F^p(u + d)\,f(u)\,du = P^*$.

Equation (3.9) is independent of $\sigma$. Hence a single two-way table of $d$-values
for different values of $P^*$ and $p$ solves the problem for all values of $\sigma$ when
$n_i = n$ $(i = 0, 1, \ldots, p)$. Tables of $d$-values satisfying (3.9) for several
values of $P^*$ are given in [2] for p = 1 (1) 10 and in [5] for p = 1 (1) 50. A short
table, using only two decimals of the original four, is excerpted from [5] (see
Table I). In the more general case when the populations have different but known
variances the procedure is defined by

(3.10)    $\bar{x}_i \ge \bar{x}_0 - d\sigma_i/\sqrt{n_i}$

and the equation determining $d$ is

(3.11)    $\int_{-\infty}^{\infty} \prod_{i=1}^{p} \left[F\left(u\,\frac{\sigma_0}{\sigma_i}\sqrt{\frac{n_i}{n_0}} + d\right)\right] f(u)\,du = P^*$.

This reduces to (3.9) in the case when $\sigma_i/\sqrt{n_i}$ = constant $(i = 0, 1, \ldots, p)$.

TABLE I*

Table of d-values satisfying (3.9) and used in the procedure defined by (3.7)

[Two-way table of d by p and P*; the entries are not legible in this copy.]

* For a more complete table see [5].
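Where the tables in [2] and [5] are not at hand, equation (3.9) can be solved numerically. The following is a rough sketch using Simpson's rule and bisection; the integration limits, step count and function names are assumptions, and the bracket assumes $P^* \ge 1/(p+1)$ so that $d \ge 0$.

```python
from statistics import NormalDist

_N = NormalDist()


def prob_correct(d, p, lo=-8.0, hi=8.0, steps=1600):
    """Composite Simpson estimate of the integral on the left of (3.9)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        w = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += w * _N.cdf(u + d) ** p * _N.pdf(u)
    return total * h / 3.0


def solve_d(p_star, p, lo=0.0, hi=10.0, tol=1e-8):
    """Bisection on d; the integral in (3.9) is increasing in d."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_correct(mid, p) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


d_one_population = solve_d(0.95, 1)
```

For $p = 1$ the integral equals $F(d/\sqrt{2})$, which gives a convenient check on the numerical routine.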

Case C. Common unknown variance ($\mu_0$ known). As in Case A, $n_i$ observations
are taken only on the $p$ populations $\Pi_i$ $(i = 1, 2, \ldots, p)$. Let $s_\nu^2$ denote
the pooled estimate of $\sigma^2$ based on $\nu = \sum_{i=1}^{p} (n_i - 1)$ degrees of
freedom ($n_i > 1$ for at least one $i$). Then the procedure is to select those and
only those populations $\Pi_i$ for which

(3.12)    $\bar{x}_i \ge \mu_0 - d s_\nu/\sqrt{n_i}$.

The equation determining $d$ is

(3.13)    $\int_0^{\infty} F^p(yd)\,q_\nu(y)\,dy = P^*$,

where $q_\nu(y)$ is the density of $y = s_\nu/\sigma = \chi_\nu/\sqrt{\nu}$. This result holds
for any $\mu_0$ and depends on $n_i$ only through the value of $\nu$.
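Equation (3.13) can be handled in the same numerical way. The sketch below is an illustration only: it writes the density of $\chi_\nu/\sqrt{\nu}$ explicitly, evaluates it through logarithms, and assumes $\nu \ge 2$; the truncation point `y_max` and the step count are arbitrary choices.

```python
from math import exp, lgamma, log
from statistics import NormalDist

_N = NormalDist()


def q_density(y, nu):
    """Density of y = chi_nu / sqrt(nu): 2 (nu/2)^(nu/2) y^(nu-1) exp(-nu y^2/2) / Gamma(nu/2)."""
    if y <= 0.0:
        return 0.0  # exact for nu >= 2, which this sketch assumes
    log_q = (log(2.0) + 0.5 * nu * log(0.5 * nu) - lgamma(0.5 * nu)
             + (nu - 1) * log(y) - 0.5 * nu * y * y)
    return exp(log_q)


def prob_correct(d, p, nu, y_max=6.0, steps=3000):
    """Simpson estimate of the integral in (3.13), truncated at y_max."""
    h = y_max / steps
    total = 0.0
    for k in range(steps + 1):
        y = k * h
        w = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += w * _N.cdf(y * d) ** p * q_density(y, nu)
    return total * h / 3.0


def solve_d(p_star, p, nu, lo=0.0, hi=50.0, tol=1e-7):
    """Bisection on d for (3.13); assumes the root lies in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_correct(mid, p, nu) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $p = 1$ the left side of (3.13) is the c.d.f. of Student's $t$ with $\nu$ degrees of freedom evaluated at $d$, so the solution can be compared with a $t$ table.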

Case D. Common unknown variance ($\mu_0$ unknown). In this case $n_i$ observations
are taken on all the populations $\Pi_i$ $(i = 0, 1, \ldots, p)$ and the pooled
estimate $s_\nu^2$ of $\sigma^2$ is based on $\nu = \sum_{i=0}^{p} (n_i - 1)$ d.f. ($n_i > 1$ for
at least one $i$).

The inequality defining the procedure is

(3.14)    $\bar{x}_i \ge \bar{x}_0 - d s_\nu/\sqrt{n_i}$.

The equation determining $d$ is

(3.15)    $\int_0^{\infty}\int_{-\infty}^{\infty} \left[\prod_{i=1}^{p} F\left(u\sqrt{n_i/n_0} + yd\right)\right] f(u)\,q_\nu(y)\,du\,dy = P^*$.

For $n_i = n$ $(i = 0, 1, \ldots, p)$ this reduces to

(3.16)    $\int_0^{\infty}\int_{-\infty}^{\infty} F^p(u + yd)\,f(u)\,q_\nu(y)\,du\,dy = P^*$.

Methods for evaluating this double integral and tables of $d$-values for selected
values of $P^*$, $p$ and $\nu$ are given in [6] and values of $d/\sqrt{2}$ for other values of
$p$ and $\nu$ are given in [3].


4. Scale parameter: gamma or chi-square populations. In this section it will
be more natural to define the population $\Pi_i$ as better than $\Pi_0$ if the scale
parameter $\theta_i \le \theta_0$.

Case A. $\theta_0$ known. We assume that the population $\Pi_i$ $(i = 1, 2, \ldots, p)$ has
the density

(4.1)    $\frac{1}{\theta_i}\,\frac{1}{\Gamma(a_i/2)}\left(\frac{x}{\theta_i}\right)^{a_i/2 - 1} e^{-x/\theta_i}$.

If $x_{ij}$ $(j = 1, 2, \ldots, n_i)$ are the $n_i$ observations on $\Pi_i$, then
$t_i = \sum_{j=1}^{n_i} x_{ij}$ has the density (4.1) with $a_i$ replaced by $\nu_i = n_i a_i$
and the procedure is as follows.
Procedure: "Retain in the selected subset only those populations $\Pi_i$
$(i = 1, 2, \ldots, p)$ for which

(4.2)    $t_i/\nu_i \le (1 + d)\,\theta_0$."
Let $q_1$ and $q_2$ denote the number of populations with $\theta \le \theta_0$ and
$\theta > \theta_0$, respectively, so that $q_1 + q_2 = p$. The probability $P$ of a correct
decision is given by

(4.3)    $P = \prod_{i=1}^{q_1} P\{t_i'/\nu_i' \le (1 + d)\,\theta_0\}$,

where primes refer to the $q_1$ populations with $\theta \le \theta_0$. Hence,

(4.4)    $P = \prod_{i=1}^{q_1} G_{\nu_i'}\left[(1 + d)\,\nu_i'\,\theta_0/\theta_i'\right]$,

where $G_{\nu_i}(z)$ is the c.d.f. of the gamma density in (4.1) with $a_i$ replaced by
$\nu_i$ and $\theta_i = 1$. A lower bound to this probability is obtained by setting
$\theta_i' = \theta_0$ and $q_1 = p$ so that the equation determining $d$ can be written in
the form

(4.5)    $\prod_{i=1}^{p} \left[\frac{1}{\Gamma(\nu_i/2)} \int_0^{\nu_i(1+d)} u^{\nu_i/2 - 1} e^{-u}\,du\right] = P^*$.

For $\nu_i = \nu$ $(i = 1, 2, \ldots, p)$ this is easily solved with the help of a table of
the c.d.f. of $y = \frac{1}{2}\chi_\nu^2$ with $\nu$ degrees of freedom.
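For equal $\nu_i = \nu$, (4.5) becomes $G_\nu(\nu(1+d)) = (P^*)^{1/p}$. The sketch below solves this by bisection under two assumptions that are not in the paper: $\nu$ is even, so that the gamma c.d.f. with shape $\nu/2$ has the closed Erlang form, and the gamma shape in (4.1) is read as $a_i/2$. All names are illustrative.

```python
from math import exp


def gamma_cdf_int_shape(x, k):
    """P{Gamma(shape=k, scale=1) <= x} for a positive integer shape k."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j          # term is now x^j / j!
        total += term
    return 1.0 - exp(-x) * total


def solve_d_scale(p_star, p, nu, tol=1e-10):
    """Solve G_nu(nu*(1+d)) = (P*)^(1/p) for d > -1, with nu even."""
    if nu % 2:
        raise ValueError("this sketch handles even nu only")
    target = p_star ** (1.0 / p)
    lo, hi = -1.0, 1.0
    while gamma_cdf_int_shape(nu * (1.0 + hi), nu // 2) < target:
        hi *= 2.0              # widen the bracket until it contains the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_cdf_int_shape(nu * (1.0 + mid), nu // 2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $\nu = 2$ the c.d.f. is exponential, so $d = -\tfrac{1}{2}\ln\{1 - (P^*)^{1/p}\} - 1$, which provides a direct check.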

Application to normal populations. If $\theta_i = 2\sigma_i^2$ $(i = 0, 1, \ldots, p)$ are the
scale parameters for the $(p + 1)$ normal populations and $x_{ij}$ $(j = 1, 2, \ldots, n_i)$
are the $n_i$ observations on the population $\Pi_i$ with the mean $\mu_i$ (known), then
we retain the population $\Pi_i$ in the selected subset if

(4.6)    $\frac{1}{n_i}\sum_{j=1}^{n_i} (x_{ij} - \mu_i)^2 \le 2(1 + d)\,\sigma_0^2$.

The equation determining the $d$ in (4.6) is the same as (4.5) with $\nu_i$ replaced
by $n_i$.
If the means $\mu_i$ are unknown and $n_i > 1$ $(i = 1, 2, \ldots, p)$, then in (4.6) we
use the sample mean $\bar{x}_i$ in place of $\mu_i$ and $n_i - 1$ in place of $n_i$. The equation
determining $d$ is again (4.5) with $\nu_i = n_i - 1$.

Transformation: If we apply the transformation [1]

(4.7)    $y_i = \ln(t_i/\nu_i)$,

then the procedure (4.2) of this section can be put in the form

(4.8)    $y_i \le \ln(\theta_0/2) + d_1$,

where

(4.9)    $d_1 = \ln[2(1 + d)]$.

Then using the normal approximation and the same argument as before, the
approximate equation determining $d_1$ is

(4.10)    $\prod_{i=1}^{p} F\left(d_1\sqrt{\nu_i/2}\right) = P^*$.

For $\nu_i = \nu$ $(i = 1, 2, \ldots, p)$ this gives an equation similar to (3.5). For the
application to normal populations the equation corresponding to (4.8) is

(4.11)    $\ln s_i^2 \le \ln \sigma_0^2 + d_1$,

where $d_1$ is determined by (4.10) with $\nu_i = n_i$ or $n_i - 1$ according as the means
$\mu_i$ are or are not known.

Case B. $\theta_0$ unknown. The assumptions are the same as in Case A except that
$n_0$ observations, viz., $x_{01}, x_{02}, \ldots, x_{0n_0}$, are taken on $\Pi_0$. The inequality
defining the procedure and corresponding to (4.2) is

(4.12)    $t_i/\nu_i \le (1 + d)\,t_0/\nu_0$,

where $t_0 = \sum_{j=1}^{n_0} x_{0j}$ and $\nu_0 = n_0 a_0$. The equation determining $d$ is
obtained as before and is given by

(4.13)    $\int_0^{\infty} \prod_{i=1}^{p} \left[\frac{1}{\Gamma(\nu_i/2)} \int_0^{\nu_i(1+d)u/\nu_0} x^{\nu_i/2 - 1} e^{-x}\,dx\right] \frac{u^{\nu_0/2 - 1} e^{-u}}{\Gamma(\nu_0/2)}\,du = P^*$.

Application to normal populations. For the case where the means are known,
the rule takes the form

(4.14)    $\frac{1}{n_i}\sum_{j=1}^{n_i} (x_{ij} - \mu_i)^2 \le \frac{1 + d}{n_0}\sum_{j=1}^{n_0} (x_{0j} - \mu_0)^2$,

where $d$ is given by (4.13) with $\nu_i = n_i$. If the $\mu_i$'s are not known and
$n_i > 1$ $(i = 0, 1, \ldots, p)$, then the rule is the same as (4.14) with $\mu_i$ and $n_i$
replaced by $\bar{x}_i$ and $n_i - 1$, respectively. The equation determining $d$ is again
(4.13) with $\nu_i = n_i - 1$.

Transformation: Using the transformation (4.7), we put the inequality
defining the rule as

(4.15)    $y_i \le y_0 + d_2$.

The approximate equation determining $d_2$ is

(4.16)    $\int_{-\infty}^{\infty} \prod_{i=1}^{p} \left[F\left(u\sqrt{\nu_i/\nu_0} + d_2\sqrt{\nu_i/2}\right)\right] f(u)\,du = P^*$,

which for $n_i = n$ $(i = 0, 1, \ldots, p)$ is of the same form as (3.9).


5. Binomial populations. 

Case A. Known standard. It is assumed that $p + 1$ binomial populations $\Pi_i$
with parameters $\theta_i$ $(i = 0, 1, \ldots, p)$ are given where $\theta_0$ is the known value
of the probability of a unit being defective in the standard population $\Pi_0$.
Again $n_i$ independent observations are taken from each population $\Pi_i$
$(i = 1, 2, \ldots, p)$. Since $\theta_i$ is the probability of a unit being defective, we
define $\Pi_i$ to be better than $\Pi_0$ when $\theta_i \le \theta_0$. Let $x_i$ denote the number
of defectives observed in the sample of $n_i$ observations from $\Pi_i$
$(i = 1, 2, \ldots, p)$.
Procedure: "Retain in the selected subset those and only those populations
$\Pi_i$ $(i = 1, 2, \ldots, p)$ for which

(5.1)    $\frac{x_i}{n_i} \le \theta_0 + d\sqrt{\frac{\theta_0(1 - \theta_0)}{n_i}}$."

Let $q_1$, $q_2$ be defined as in Sec. 4; let $[m_i(d)]$ denote the largest integer in

(5.2)    $m_i(d) = n_i\theta_0 + d\sqrt{n_i\theta_0(1 - \theta_0)}$    $(i = 1, 2, \ldots, p)$.


The probability $P$ of retaining all the $q_1$ populations with $\theta \le \theta_0$ is given by

(5.3)    $P = \prod_{i=1}^{q_1} \left[\sum_{j=0}^{[m_i'(d)]} \binom{n_i'}{j} \theta_i'^{\,j} (1 - \theta_i')^{n_i' - j}\right]$.

A lower bound is obtained by setting $\theta_i' = \theta_0$ $(i = 1, 2, \ldots, p)$ and $q_1 = p$.
The fact that $\theta_i' = \theta_0$ gives a lower bound can be shown by writing the binomial
sum as an incomplete Beta function. Hence the inequality determining $d$ becomes

(5.4)    $\prod_{i=1}^{p} \sum_{j=0}^{[m_i(d)]} \binom{n_i}{j} \theta_0^{\,j} (1 - \theta_0)^{n_i - j} \ge P^*$,

and the solution is the smallest value of $d$ satisfying (5.4). If $n_i = n$ then
$[m_i(d)] = [m(d)]$ and (5.4) reduces to

(5.5)    $\sum_{j=0}^{[m(d)]} \binom{n}{j} \theta_0^{\,j} (1 - \theta_0)^{n - j} \ge (P^*)^{1/p}$.

This is easily solved by consulting a table of cumulative binomial probabilities.

For large values of $n_i$ (large enough for the normal approximation to give
good results) the inequality determining $d$ can be approximated by the simple
equation

(5.6)    $F(d) = (P^*)^{1/p}$,

where $F$ is the standard normal c.d.f. This equation is independent of $n_i$ and is
much easier to solve than (5.4).
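For equal sample sizes, the exact condition (5.5) and the approximation (5.6) can be compared directly. In this sketch (names and the example figures are assumptions) the smallest admissible integer cutoff $m$ is found first, and the corresponding smallest $d$ is then read off from (5.2).

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(m, n, theta):
    """P{X <= m} for X binomial(n, theta)."""
    if m < 0:
        return 0.0
    return sum(comb(n, j) * theta ** j * (1 - theta) ** (n - j)
               for j in range(min(m, n) + 1))


def exact_cutoff(p_star, p, n, theta0):
    """Smallest integer m satisfying (5.5), and the d with m(d) = m in (5.2)."""
    target = p_star ** (1.0 / p)
    m = 0
    while m < n and binom_cdf(m, n, theta0) < target:
        m += 1
    d = (m - n * theta0) / sqrt(n * theta0 * (1 - theta0))
    return m, d


def normal_d(p_star, p):
    """Large-sample value of d from (5.6)."""
    return NormalDist().inv_cdf(p_star ** (1.0 / p))


m, d_exact = exact_cutoff(0.95, 1, 10, 0.5)
d_approx = normal_d(0.95, 1)
```

With $n = 10$, $\theta_0 = \tfrac{1}{2}$, $p = 1$ and $P^* = .95$ the cumulative probabilities are $968/1024$ at 7 and $1013/1024$ at 8, so the exact cutoff is 8 defectives; the discreteness makes the exact $d$ somewhat larger than the normal value.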

Case B. Unknown standard. The assumptions are the same as in Case A except
that $n_0$ observations are taken on the standard population $\Pi_0$. Let $x_0$ be
the number of defectives among the $n_0$.

Procedure: "Retain in the selected subset those and only those populations
$\Pi_i$ $(i = 1, 2, \ldots, p)$ for which


Ni no a Ni MN 


The probability $P$ of retaining all the $q_1$ populations with $\theta \le \theta_0$ attains
a minimum when $\theta_i = \theta$ $(i = 0, 1, \ldots, p)$ and $q_1 = p$ and is given by

(5.8)    $P(\theta, d) = \sum_{y=0}^{n_0} \left\{\prod_{i=1}^{p} \left[\sum_{j=0}^{[m_i(y,d)]} \binom{n_i}{j} \theta^{j} (1 - \theta)^{n_i - j}\right]\right\} \binom{n_0}{y} \theta^{y} (1 - \theta)^{n_0 - y}$,

where $[m_i(y, d)]$ is the largest integer contained in


i lr i. 

- l an 

5.9 mi(y,d) = y+ 4/—4+-. 
n 2 ni Ny 


Then the desired value of $d$ for (5.7) is the smallest number for which

(5.10)    $\min_{0 \le \theta \le 1} P(\theta, d) \ge P^*$.

Since, except for very small $n_i$ or very large $p$, the minimum occurs near
$\theta = \frac{1}{2}$, we can obtain an approximate solution for $d$ by finding the smallest
number for which

(5.11)    $P(\frac{1}{2}, d) \ge P^*$.


A simpler approximate solution, which gives good results when the $n_i$ are not
too small and $p$ is not too large, is the normal approximation obtained under
the assumption that $\theta_i = \frac{1}{2}$ $(i = 0, 1, \ldots, p)$. Then from (5.7) we obtain for
the approximate equation determining $d$


x Pp ; /n. ar n 
9.12) . f — +. fi+= ( = p*. 
os [ 11 , (u V N% . V "YL fw du I 


For $n_i = n$ $(i = 0, 1, \ldots, p)$ the rule (5.7) can be written as

(5.13)    $x_i \le x_0 + d'$,


/ . . . 
where d’ = dv/n/2. In carrying out the rule we can assume that d’ is an in- 
teger. The desired value of d’ is the smallest integer for which 


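(5.14)    $\min_{0 \le \theta \le 1} \sum_{y=0}^{n} \left[\sum_{j=0}^{y+d'} \binom{n}{j} \theta^{j} (1 - \theta)^{n - j}\right]^{p} \binom{n}{y} \theta^{y} (1 - \theta)^{n - y} \ge P^*$.

Then (5.12) can be written in the form

(5.15)    $\int_{-\infty}^{\infty} F^p(u + d)\,f(u)\,du = P^*$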

and the relation between d’ and d, using a continuity correction, is 


F n (d-v/n — 1) 
(5.16) d’=d WV 5 : wet 

) } 2 

- \ 
where $\{x\}$ is the smallest integer greater than or equal to $x$.

Transformation: It may be desirable to solve the binomial problem by using
an arc sine transformation and converting it into one involving the location
parameter of the normal distributions. For example, for the Case B above with
$n_i = n$ $(i = 0, 1, \ldots, p)$ if we use the arc sine transformation as given in [4],
the inequality defining the procedure is

(5.17)    $\arcsin\sqrt{\frac{x_i}{n+1}} + \arcsin\sqrt{\frac{x_i + 1}{n+1}} \le \arcsin\sqrt{\frac{x_0}{n+1}} + \arcsin\sqrt{\frac{x_0 + 1}{n+1}} + \frac{d\sqrt{2}}{\sqrt{2n+1}}$,

where the approximate equation determining $d$ is the same as (3.9) so that
Table I is applicable here also.


SOME PROBLEMS OF SIMULTANEOUS MINIMAX ESTIMATION
By STANISLAW TRYBULA


Institute of Mathematics, Polish Academy of Sciences, Wroclaw


1. Summary. In this paper, we give minimax estimates of the parameters of 
the multivariate hypergeometric distribution and of the multinomial distribu- 
tion, and of some parameters of an unspecified distribution with known range. 
We use as loss a weighted linear combination of squared differences between 
the true and the estimated values of the parameters. Some properties of the 
minimax estimates obtained are discussed. 


2. Introduction. For our purpose, it is sufficient to define the estimation
problem in a fixed sample size experiment as follows ([3], [4]). The random
variable $X$ is distributed in the space $\mathcal{X}$ according to the distribution $F$
belonging to the family $\mathcal{F}$. We want to estimate $\omega(F)$ where $\omega$ is a function,
defined on $\mathcal{F}$, the values of which belong to some space $\Omega$. (In the following
we assume that $X$ and $\omega(F)$ are vector valued.) An estimate is a statistic $f(X)$
having values in $\Omega$. The nonnegative function $L[\omega(F), f(x)]$ is the loss
resulting if, when $F$ obtains, the estimate $f(x)$ is made. Define the risk by

(1)    $R(f, F) = E\{L[\omega(F), f(X)] \mid F\}$

and call $v(f) = \sup_{F \in \mathcal{F}} R(f, F)$ the guaranteed value for the estimate $f$. We
seek the minimax estimate $f^0$, that is, the estimate whose guaranteed value is
minimal. Obviously, such an estimate does not always exist. It is our aim to
derive minimax estimates in some specific problems.


3. Problem 1. In practice, we often meet the following situation. A lot
consisting of $N$ units of a product has been produced. The units are classified
into $l$ categories, the $i$th category containing $U_i$ units $(i = 1, \ldots, l)$. A
sample of size $n$ is taken from the lot in which $k_1, \ldots, k_l$ units of categories
$1, \ldots, l$ are observed. The problem is to estimate $U_1, \ldots, U_l$.

This leads to the estimation of the parameter $U = (U_1, \ldots, U_l)$ of a
multivariate hypergeometric distribution. Thus, let

(2)    $P\{X_1 = k_1, \ldots, X_l = k_l\} = \prod_{i=1}^{l} \binom{U_i}{k_i} \Big/ \binom{N}{n}$.

It is known that

(3)    $m_i = E(X_i \mid U) = n\,\frac{U_i}{N}$,

(4)    $\sigma_i^2 = E\{(X_i - E(X_i \mid U))^2 \mid U\} = \frac{n(N - n)}{N^2(N - 1)}\,U_i(N - U_i)$.

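Suppose that the loss is

(5)    $L(U, f) = \sum_{i=1}^{l} c_i [f_i(X) - U_i]^2$    $(c_i \ge 0)$,

where $f = (f_1, \ldots, f_l)$ is the estimate of $U$ and $X = (X_1, \ldots, X_l)$ is the
sample. The risk is then

(6)    $R(f, U) = E[L(U, f) \mid U] = E\left\{\sum_{i=1}^{l} c_i [f_i(X) - U_i]^2 \,\Big|\, U\right\}$.

If we study estimates of the form

$f_i(X) = aX_i + b_i$    $(i = 1, \ldots, l)$,

then

(7)    $R(f, U) = \sum_{i=1}^{l} c_i E\{[aX_i + b_i - U_i]^2 \mid U\}$
              $= \sum_{i=1}^{l} c_i [(a m_i + b_i - U_i)^2 + a^2 \sigma_i^2]$
              $= \sum_{i=1}^{l} c_i \left[\left(a\,\frac{nU_i}{N} + b_i - U_i\right)^2 + a^2\,\frac{n(N - n)}{N^2(N - 1)}\,U_i(N - U_i)\right]$.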


Let the constant $a$ assume a value such that the terms quadratic in $U$ vanish.
For this, it suffices to put

$a = \frac{N}{n + \sqrt{n(N - n)/(N - 1)}}$.

If, moreover, we put

$b_i = a\,s_i \sqrt{n(N - n)/(N - 1)}$    $(i = 1, \ldots, l)$,

then (7) may be written

(8)    $R(f, U) = \frac{N \cdot n(N - n)/(N - 1)}{\left(n + \sqrt{n(N - n)/(N - 1)}\right)^2} \sum_{i=1}^{l} c_i \left(N s_i^2 + (1 - 2s_i)U_i\right)$.

Without loss of generality, we may assume $c_1 \ge c_2 \ge \cdots \ge c_l \ge 0$. For the
present, assume also that $c_2 \ne 0$. Let $l_0$ be the greatest index $i$ for which
$c_i \ne 0$ and let

(9)    $L = \max\left\{k : k \le l_0,\ c_k \sum_{i=1}^{k} \frac{1}{c_i} > k - 2\right\}$.

The above assumptions being satisfied, we prove the following lemma:
If $L < l$ then

(10)    $\delta = \frac{L - 2}{\sum_{j=1}^{L} 1/c_j} \ge c_i$,    $i = L + 1, L + 2, \ldots, l$.

PROOF. First, observe that a proof of the inequality is necessary only for
$i = L + 1$. If $c_{L+1} = 0$, then the lemma obviously holds. If $c_{L+1} \ne 0$, it
follows from the definition of $L$ that

$L - 1 \ge c_{L+1} \sum_{j=1}^{L+1} \frac{1}{c_j} = 1 + c_{L+1} \sum_{j=1}^{L} \frac{1}{c_j}$.

The lemma is a direct consequence of this inequality.
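Now put

(11)    $s_i = \frac{1}{2}\left(1 - \frac{\delta}{c_i}\right)$ when $i \le L$;    $s_i = 0$ when $i > L$.

Observe that for $i \le L$, $0 < s_i \le \frac{1}{2}$. We shall show that the estimate
$f^0 = (f_1^0, \ldots, f_l^0)$, where

(12)    $f_i^0(X) = N\,\frac{X_i + s_i \sqrt{n(N - n)/(N - 1)}}{n + \sqrt{n(N - n)/(N - 1)}}$,

is the minimax estimate sought.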
From (8) and (11) we have 


(13) R(f°, U) =- 


Observe that for

(14)    $U_{L+1} = U_{L+2} = \cdots = U_l = 0$

$R(f^0, U) = c$, where $c$ is a constant. By the lemma, $R(f^0, U) \le c$. Thus, by
theorem 2.1 of [4], it is sufficient to prove that a distribution of the random
variable $U$ exists which satisfies (14) and for which $f^0$ is the Bayes estimate.
We seek such a distribution among those of the form
(15) PUiy = +--+ = U, = 0) = 1 
 Mi«n,-+,~—) + oot tee te 


U;! coe @ | 








Let 


(17) r(f, P) = E[R(f, U)] = Da BBY) — Uj’ | VU}. 


It follows from (15) that the expected risk does not depend on f; if k; + 0 for 
at least one j > L. Thus, any estimate which minimizes (17) throughout the 
region kyi1 = kiz2 = --- = ky = 0 is a Bayes estimate. Now, if 
Kas = Kruse ss eco = ky = 0 
then, as is well-known, the expression (17) attains its minimum value for 
filki, ---, kr, 0, --- , 0) 
= E(U, | X ; ky, oe X, ke ; Bess -. ee X; . 0) 
(0 for: > L; 


(18) | > “ I ( \F (a; ~. U,) 
k; u 


u i+ +up=N 7=1 


upekyrs+ mM LZky oo ar otherwise. 


| - Il u;\ Ma; + w) 
uytesstup=N j=1 k; u; ! 
\ wiz uz KL 


je 


The second part of Eq. (18) reduces to 


x w TLE + u;) 


=N y=1 (1; ~o 


ite eu 
wicks upekn 
oa ; 
- l(a, + u,) 
Das 
Bite stuzeN j=) (u; ce k;)! 
4y2k;,° AL=ky 


L + j — 
[(a, +k +) - a; Il tS as 


*++0p=N—n i=1 v; . 


- jae 


vpt+++0p=N—n j=l 











vr) >0,-°*,on >~0 
Ta; + ky + v;) 
Math +ut+) I a 
pptes+tup=N—n l= vj. 
DLO _ fos — - 
> jy et b+ od 
vite +uL=N—n } Vj. 
vy 0 *,we20
Observe that


(N — n)! Pb + uy) «++ Pox + 1») 


. o - - L 
typtes+top=N—n Vy: "es Te ¢ 
ne0oepz0 r(N-n+ Db; 


i=l 


br-l (N — 2 
a a... oe > _ 
“ . pyt-+-+0p=N—n V1 2 °°" 
vy 20,-+-op>0 


pi --- pi’ dp «++ dpz 


b,-1 by-1 T'(b,) ae I'(bz) 
ee Pr. . 2) aa dp, --- dpp = — a - 


oT 
Pit---+p_=! r Db 
P129,---,pp2o i=] 


Applying (19) to L; with b; = a; + k; for 7 ¥ t and 


b =a,+k +10 =1,---,LZ) 


a; + k; we obtain 


and to M; with b; = 


L 
(a; + k,) (wv +> a) 
7=1 


— a 


(wv = - «) k, + (N — n)a 


ee oe 


Thus $f^0$ is minimax whenever $a_i > 0$; that is, when $N > n + 1$. For $N = n$
this result is immediate. For $N = n + 1$ it is a consequence of the fact that
$f^0$ is Bayes for the a priori distribution of $U$ defined by


Pin = +: = U 


1 = 0) = 1, 


PUU, = w,--:,UL = uw) * a5. 
Up to this point, we have assumed $c_2 \ne 0$. Consider now the remaining cases.
If all $c_i = 0$ then, obviously, every estimate is minimax. If $c_1$ alone is
different from 0 then the problem may be considered as that of finding a minimax
estimate of the parameter $U$ in a one-dimensional hypergeometric distribution for
the loss $L = (f - U)^2$. In this case, the formula for the minimax estimate is
(see [4])

(21)    $f^0(X) = N\,\frac{X + \frac{1}{2}\sqrt{n(N - n)/(N - 1)}}{n + \sqrt{n(N - n)/(N - 1)}}$.

It is easy to verify that the estimate (12) satisfies the condition

$f_1^0(X) + \cdots + f_l^0(X) = N$.

Observe that we are actually dealing with only $l - 1$ independent parameters
since one parameter, say $U_l$, may be computed from

(22)    $U_1 + \cdots + U_l = N$.

If we consider the problem of finding a minimax estimate for $U_1, \ldots, U_{l-1}$
under the loss

(23)    $L(U, f) = \sum_{i=1}^{l-1} \tilde{c}_i (f_i - U_i)^2$,

the same estimate as above for $U_1, \ldots, U_{l-1}$ results as is seen by identifying
$c_i$ in the above with $\tilde{c}_i$ $(i = 1, \ldots, l - 1)$ and putting $c_l = 0$.

In solving our problem, we have restricted ourselves to the case $c_i \ge 0$. If,
however, some $c_i < 0$ then for $f_i \to \pm\infty$ the loss tends to $-\infty$ and,
consequently, the problem becomes trivial.
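The constancy of the risk on which the minimax argument rests can be checked by brute force in small cases. The sketch below treats only the equal-weight case $c_1 = \cdots = c_l = 1$, for which $s_i = 1/l$ in (12); it enumerates every lot composition $U$ and every sample outcome. The function names and the particular values of $N$, $n$, $l$ are illustrative assumptions.

```python
from math import comb, sqrt


def compositions(total, parts):
    """All tuples of `parts` nonnegative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def estimate(k, N, n):
    """Formula (12) with s_i = 1/l (equal weights)."""
    r = sqrt(n * (N - n) / (N - 1))
    return [N * (ki + r / len(k)) / (n + r) for ki in k]


def risk(U, n):
    """Exact risk under the multivariate hypergeometric law (2), all c_i = 1."""
    N = sum(U)
    total = 0.0
    for k in compositions(n, len(U)):
        ways = 1
        for Ui, ki in zip(U, k):
            ways *= comb(Ui, ki)       # zero whenever ki > Ui
        if ways:
            f = estimate(k, N, n)
            total += ways / comb(N, n) * sum((fi - Ui) ** 2 for fi, Ui in zip(f, U))
    return total


risks = [risk(U, 3) for U in compositions(6, 3)]
```

For $N = 6$, $n = 3$, $l = 3$ all 28 compositions give the same risk, in agreement with (8): with $r^2 = n(N-n)/(N-1)$ the common value is $N^2 r^2 (l-1) / \{l\,(n+r)^2\}$.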

In the special case $c_1 = c_2 = \cdots = c_l > 0$, formula (12) takes the form

(24)    $f_i^0(X) = N\,\frac{X_i + \frac{1}{l}\sqrt{n(N - n)/(N - 1)}}{n + \sqrt{n(N - n)/(N - 1)}}$.


4. Corollaries for the multinomial case. For $N \to \infty$, the distribution of $X$
converges to the multinomial distribution defined by

$P(X_1 = k_1, \ldots, X_l = k_l) = \frac{n!}{k_1! \cdots k_l!}\,p_1^{k_1} \cdots p_l^{k_l}$,

and $f_i^0(X)/N$ converges to

$g_i^0(X) = \frac{X_i + s_i \sqrt{n}}{n + \sqrt{n}}$.

We shall prove¹ that $g^0 = (g_1^0, \ldots, g_l^0)$ is really a minimax estimate of the
parameter $p = (p_1, \ldots, p_l)$ for the loss

(25)    $L(g, p) = \sum_{i=1}^{l} c_i (g_i - p_i)^2$,    $c_i \ge 0$.

When $L$, $\delta$ and $s_i$ are defined by (9), (10) and (11), respectively, the risk is

(26)    $R(g^0, p) = E\{L(g^0, p) \mid p\} = \frac{1}{(\sqrt{n} + 1)^2} \left[\sum_{i=1}^{L} (c_i s_i^2 + \delta p_i) + \sum_{i=L+1}^{l} c_i p_i\right]$,

which for $p_{L+1} = \cdots = p_l = 0$ is constant and, by the lemma of Sec. 3,
maximum in $p$.
As is easy to verify, $g^0$ is Bayes for the a priori distribution $G(p)$ defined by

(27)    $dG(p) = C\,p_1^{\sqrt{n}\,s_1 - 1} \cdots p_L^{\sqrt{n}\,s_L - 1}\,dp_1 \cdots dp_{L-1}$ when $p_{L+1} = \cdots = p_l = 0$;    $dG(p) = 0$ otherwise.

By theorem 2.1 of [4] it follows that $g^0$ is minimax.
For $c_1 = \cdots = c_l > 0$, $s_i = 1/l$ and the minimax estimate $g^0$ takes the form

(28)    $g_i^0(X) = \frac{X_i + \frac{1}{l}\sqrt{n}}{n + \sqrt{n}}$.

This case was previously solved by H. Steinhaus in [5].


¹ While this paper was being written, Joseph Dubay communicated to me a result
similar to this but not in its full generality.


5. Problem 2. We shall prove the following theorem:

THEOREM. Let $X$ be a random variable distributed according to the unknown
distribution $F$ on the measurable space $A$. Let $g_1, \ldots, g_m$ be such bounded
measurable functions on $A$ that there exist two points $x', x'' \in A$ such that each
of these functions attains its minimum in $x'$ and its maximum in $x''$. Let
$X_1, \ldots, X_n$ be a random sample from $F$, and let $\lambda_i = E\{g_i(X)\}$. If the loss
is given by

(29)    $L(f, \lambda) = \sum_{i=1}^{m} c_i (f_i - \lambda_i)^2$,

where $f = (f_1, \ldots, f_m)$ is an estimate of $\lambda = (\lambda_1, \ldots, \lambda_m)$, then the
minimax estimate of $\lambda$ is given by

(30)    $f_i^0(X_1, \ldots, X_n) = \frac{\sum_{j=1}^{n} g_i(X_j)}{n + \sqrt{n}} + \frac{s_i}{\sqrt{n} + 1}$

($s_i$ is the arithmetic mean of the maximum and minimum values of $g_i(x)$).

PROOF. If $f_i(X_1, \ldots, X_n) = a \sum_{j=1}^{n} g_i(X_j) + b_i$, then the risk may be
written

(31)    $R(f, F) = E\left\{\sum_{i=1}^{m} c_i (f_i - \lambda_i)^2 \,\Big|\, F\right\} = \sum_{i=1}^{m} c_i E\left\{\left[a \sum_{j=1}^{n} g_i(X_j) + b_i - \lambda_i\right]^2 \Big|\, F\right\}$
               $= \sum_{i=1}^{m} c_i \{[(1 - an)\lambda_i - b_i]^2 + n a^2 E\{[g_i(X) - \lambda_i]^2 \mid F\}\}$.

Let

$\alpha_i = \min_{x \in A} g_i(x) = g_i(x')$,    $\beta_i = \max_{x \in A} g_i(x) = g_i(x'')$.

It is easy to prove that

(32)    $E\{[g_i(X) - \lambda_i]^2 \mid F\} \le (\beta_i - \lambda_i)(\lambda_i - \alpha_i)$.

Hence

(33)    $R(f, F) \le \sum_{i=1}^{m} c_i \{[(1 - an)\lambda_i - b_i]^2 + n a^2 (\beta_i - \lambda_i)(\lambda_i - \alpha_i)\}$.

Putting

$a = \frac{1}{n + \sqrt{n}}$,    $b_i = \frac{\alpha_i + \beta_i}{2(\sqrt{n} + 1)}$,

we obtain

(34)    $R(f^0, F) \le \frac{1}{4(\sqrt{n} + 1)^2} \sum_{i=1}^{m} c_i (\beta_i - \alpha_i)^2 = c$.

Observe that if a distribution $F$ of the random variable $X$ is defined by

(35)    $P(X = x') = 1 - p$,    $P(X = x'') = p$,

then $\lambda_i = \alpha_i + (\beta_i - \alpha_i)p$, and equality obtains in (32); i.e.

(36)    $R(f^0, F) = c$.

The distribution $F$ depends on the parameter $p$. Since (34) and (36) hold, it is
sufficient to show (as in Sec. 3) that there exists a distribution $G$ of $p$ for which
(30) is Bayes; that is, a distribution $G$ such that (30) minimizes the expected
risk

$r(f, G) = E[R(f, F)] = \sum_{i=1}^{m} c_i E\{E[(\lambda_i - f_i)^2 \mid F]\}$
        $= \sum_{i=1}^{m} c_i E\{E\{[\alpha_i + (\beta_i - \alpha_i)p - f_i]^2 \mid p\}\}$.

It is easy to verify that this happens for the distribution $G^0(p)$ defined by
the equation

(37)    $dG^0(p) = C (pq)^{(\sqrt{n}/2) - 1}\,dp$    $(q = 1 - p)$.

This completes the proof.
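The key step of the proof, that the risk of (30) is constant over the two-point distributions (35), can be checked numerically for a single function $g$ with $c = 1$. In the sketch below the number of observations equal to $x''$ is binomial $(n, p)$; the function names and the values $\alpha = -1$, $\beta = 3$, $n = 9$ are illustrative assumptions.

```python
from math import comb, sqrt


def estimate(successes, n, alpha, beta):
    """Estimate (30) when `successes` of the n observations equal x''."""
    total_g = alpha * (n - successes) + beta * successes
    s = 0.5 * (alpha + beta)
    return total_g / (n + sqrt(n)) + s / (sqrt(n) + 1)


def risk(p, n, alpha, beta):
    """Exact squared-error risk under the two-point distribution (35)."""
    lam = alpha + (beta - alpha) * p
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               * (estimate(k, n, alpha, beta) - lam) ** 2
               for k in range(n + 1))


bound = (3 - (-1)) ** 2 / (4 * (sqrt(9) + 1) ** 2)   # right-hand side of (34)
risks = [risk(p / 10, 9, -1.0, 3.0) for p in range(11)]
```

Every entry of `risks` agrees with `bound`, which here equals $16/64 = 0.25$.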


6. In this paper, we have used the loss $L = \sum_i c_i (f_i - \omega_i)^2$. This loss has
been extensively investigated ([2], [4], [5], [6]). For many special problems,
other loss functions might be used, for example,

$L = \sum_i c_i\,|f_i - \omega_i|$,

about which little is known at present.

Problems considered in this paper were suggested to me by L. J. Savage and
H. Steinhaus. I am indebted to J. A. Dubay and L. J. Savage for help and
suggestions made during the preparation of this paper.

A THEOREM ON FACTORIAL MOMENTS AND ITS APPLICATIONS


By P. V. KRISHNA IYER
Defense Science Laboratory, New Delhi


0. Summary. The theorem that the sth factorial moment for the sum of N
events is s! times the sum of the expectations for any s of the events occurring
simultaneously has been proved by induction. The applications of this result in
obtaining easily the moments of a number of distributions arising from a sequence
of observations belonging to two continuous populations and other cases have
been demonstrated.


1. Introduction. A number of distributions arising from a sequence of n
observations belonging to a binomial population have been considered by the
writer [3], [4] in some of his earlier publications. The moments of these
distributions were obtained by using the theorem that the sth factorial moment
is equal to s! times the expectation for s of the characters considered in the
distribution. Thus for a sequence of observations consisting of A's and B's with
the probabilities p and q respectively, the sth factorial moment for the
distribution of the total number of AB and BA joins between successive
observations is the expectation for s joins like AB and BA in the sequence. It can be
seen that there are s different ways of obtaining s joins. They are:

(1) From (s + 1) consecutive observations.
(2) From two sets of $l_1$ and $l_2$ consecutive observations such that
$l_1 + l_2 - 2$ is equal to s.
(3) From three sets of $l_1$, $l_2$ and $l_3$ consecutive observations such that
$l_1 + l_2 + l_3 - 3$ is equal to s.
(4) From k sets of $l_1, l_2, \ldots, l_k$ consecutive observations subject to the
condition

$\sum_{r=1}^{k} l_r - k = s$,

where k takes values 1 to s.

The sum of the expectations for 1, 2, 3, ..., s is equal to 1/s! (the sth
factorial moment for the distribution of the total number of AB and BA joins of
the sequence).

The theorem as it stands appears to be applicable only for the distributions 
arising from a binomial sequence consisting of A’s and B’s with fixed prob- 
abilities p and q. We shall show in this paper that this result can be applied 
for distributions arising from two samples belonging to populations with
cumulative distribution functions F and G. Before discussing this aspect, we shall
first give a rigorous proof of the theorem and then show how it can be applied
for the case of continuous distributions. The use of the result for distributions
arising from a Markoff chain is also illustrated.


2. Statement of theorem and proof.


THEOREM. The sth factorial moment about the origin of any statistic X which
is the sum of N events, dependent or independent, is equal to s! times the sum of
the expectations for any s of the events occurring together.

PROOF. Let the events be denoted by $x_1, x_2, \ldots, x_N$. As in the case of the
binomial distribution, assume that the x's take value 1 if the event occurs and
zero otherwise. Define

$X = \sum_{r=1}^{N} x_r$.

Then

(1)    $E(X) = E(\sum x_r) = \sum E(x_r)$
            = the sum of the expectations of the different events
            = the sum of the probabilities for the events to occur.

Further,

$E(X^2) = E(\sum x_r)^2 = E(\sum x_r^2) + 2E(\sum x_r x_s)$,    $r > s$,

and, since $x_r^2 = x_r$,

$E(\sum x_r^2) = E(X)$;

hence

$E(X^2) = E(X) + 2E(\sum x_r x_s)$

or

(2)    $E\{X(X - 1)\} = 2\sum E(x_r x_s)$
            = 2 (sum of the expectations for any two of the events)
            = 2 (sum of the probabilities for any two of the events
              to occur together).

Similarly,

$E(X^3) = E(\sum x_r^3) + 3E(\sum x_r^2 x_s) + 3E(\sum x_r x_s^2) + 6E(\sum x_r x_s x_t)$,    $r > s > t$.

Since

$E(\sum x_r^3) = E(X)$,    $E(x_r^2 x_s) = E(x_r x_s)$,    $E(x_r x_s^2) = E(x_r x_s)$,

we have

$E(X^3) = E(X) + 6E(\sum x_r x_s) + 3!\,E(\sum x_r x_s x_t)$.

Substituting the value of $E(\sum x_r x_s)$ from (2), we get

$E(X^3) = E(X) + 3E\{X(X - 1)\} + 3!\,E(\sum x_r x_s x_t)$

or

(3)    $E\{X(X - 1)(X - 2)\} = 3!\sum E(x_r x_s x_t)$
            = 3! (sum of the expectations for any three of the events)
            = 3! (sum of the probabilities for any three of the events
              to occur together).

Thus the theorem holds good for s = 1 to 3.
It may be noted that the results given above hold good even without taking
the expectation of both sides because the x's take values 1 or 0 only.
We shall now establish the general relation by induction. Write
$X^{[s]} = X(X - 1) \cdots (X - s + 1)$. For this we show that if

(4)    $X^{[s]} = s! \sum x_{t_1} x_{t_2} \cdots x_{t_s}$

holds good for any value of s, it is true for (s + 1) also.
Multiplying both sides of (4) by X we get

(5)    $X^{[s]} X = s! \sum x_{t_1} x_{t_2} \cdots x_{t_s} \left(\sum x_r\right)$
               $= s! \left[s \sum x_{t_1} x_{t_2} \cdots x_{t_s} + (s + 1) \sum x_{t_1} x_{t_2} \cdots x_{t_{s+1}}\right]$,

since a factor $x_r$ already present in a product leaves it unchanged, while each
product of $s + 1$ distinct x's arises in $s + 1$ ways.
Substituting for $\sum x_{t_1} x_{t_2} \cdots x_{t_s}$ from (4), (5) reduces to

(6)    $X^{[s]}(X - s) = X^{[s+1]} = (s + 1)! \sum x_{t_1} x_{t_2} \cdots x_{t_{s+1}}$.

Taking the expectation of both sides,

(7)    $E(X^{[s+1]}) = (s + 1)! \sum E(x_{t_1} x_{t_2} \cdots x_{t_{s+1}})$.

Hence the theorem.
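The theorem can be checked by direct enumeration for dependent events. In the sketch below (an illustration, not part of the original argument) the events are the AB joins at the successive positions of a binomial sequence of length n; the function name and the chosen values of n, p and s are assumptions.

```python
from itertools import combinations, product
from math import comb, factorial


def factorial_moment_check(n, p, s):
    """Return (E[X^[s]], s! * sum of E[x_t1 ... x_ts]) for AB joins in n trials."""
    q = 1.0 - p
    lhs = rhs = 0.0
    for seq in product("AB", repeat=n):
        prob = p ** seq.count("A") * q ** seq.count("B")
        # x[r] = 1 when observations r and r+1 form an AB join
        x = [int(seq[r] == "A" and seq[r + 1] == "B") for r in range(n - 1)]
        total = sum(x)
        falling = 1
        for k in range(s):
            falling *= total - k          # X(X-1)...(X-s+1)
        lhs += prob * falling
        together = sum(1 for idx in combinations(range(n - 1), s)
                       if all(x[i] for i in idx))
        rhs += prob * factorial(s) * together
    return lhs, rhs


lhs, rhs = factorial_moment_check(6, 0.3, 2)
closed_form = factorial(2) * comb(6 - 2, 2) * (0.3 * 0.7) ** 2
```

The two sides agree, and both equal $2!\binom{4}{2}(pq)^2$: overlapping AB joins cannot occur together, and there are $\binom{n-r}{r}$ ways of placing $r$ disjoint pairs among $n$ positions.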

3. Applications. We shall now examine how the above result can be applied 
for obtaining easily the moments of a number of distributions including those 
arising from a simple Markoff chain. Some of the distributions considered here 
have been discussed by Wald and Wolfowitz [7], Mood [6], Mann & Whitney [5], 
and others. 


(1) Binomial distribution. It is obvious that the rth factorial moment for the 
distribution of x, the number of successes out of » trials is given by 
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where p is the probability for a success. 
(2) Hyperqcometric distribution. This can be deduced from the above by sub- 
stituting 


» W-mp” 
P Nin e's 


where NV and M have the usual significance. This follows from the fact that the 
probability p for the Ist, 2nd, 3rd, --- successes are 


N—-M-—2 


a5" 


(3) Distribution of the number of AB joins between successive observations of a
binomial sequence. We first note that $r$ AB joins can be formed from only $r$ sets
of two consecutive observations each and therefore

$$\frac{\mu'_{[r]}}{r!} = \binom{n - r}{r} p^r q^r.$$

This can be seen from the fact that the probability for an AB join is $pq$ and
that there are $\binom{n - r}{r}$ ways of obtaining them from $n$ observations in a
sequence.
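
The counting argument for AB joins gives $\mu'_{[r]} = r!\binom{n - r}{r}(pq)^r$. As a rough check, and not as part of the original derivation, the sketch below enumerates every binomial sequence of length $n$ and compares the enumerated factorial moment with this closed form. The function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def ab_joins(seq):
    """Number of positions i with seq[i] == 'A' and seq[i+1] == 'B'."""
    return sum(a == "A" and b == "B" for a, b in zip(seq, seq[1:]))


def falling(x, r):
    """Falling factorial x(x-1)...(x-r+1)."""
    out = 1
    for k in range(r):
        out *= x - k
    return out


def factorial_moment_enumerated(n, r, p):
    """E[X^(r)] for X = number of AB joins, by summing over all 2^n sequences."""
    q = 1 - p
    total = Fraction(0)
    for seq in product("AB", repeat=n):
        prob = p ** seq.count("A") * q ** seq.count("B")
        total += prob * falling(ab_joins(seq), r)
    return total


def factorial_moment_closed(n, r, p):
    """r! * C(n - r, r) * (pq)^r, the form suggested by the counting argument."""
    q = 1 - p
    return factorial(r) * comb(n - r, r) * (p * q) ** r


p = Fraction(1, 3)
for r in (1, 2, 3):
    print(r, factorial_moment_enumerated(6, r, p), factorial_moment_closed(6, r, p))
```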

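(4) Distribution of AB joins for binomial sequence of $n_1$ A's and $n_2$ B's. As in the
case of hypergeometric series, we substitute

$$p^r q^r = \frac{n_1^{(r)} n_2^{(r)}}{(n_1 + n_2)^{(2r)}}$$

in the results given in (3) above. Thus

$$\frac{\mu'_{[r]}}{r!} = \binom{n_1 + n_2 - r}{r} \frac{n_1^{(r)} n_2^{(r)}}{(n_1 + n_2)^{(2r)}}.$$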




(5) Distribution of AB and BA joins between consecutive observations of a
binomial sequence. Taking for simplicity the third factorial moment, we note that
three joins can be obtained from (i) four consecutive observations ABAB or
BABA, (ii) two sets, one of two and the other of three consecutive observations
like AB - ABA; BA - ABA; AB - BAB and BA - BAB, (iii) three sets
of each of two consecutive observations AB or BA. The sum of the expectations
for the above three ways of obtaining three joins is

$$\frac{\mu'_{[3]}}{3!} = 2(n - 3)p^2 q^2 + 8\binom{n - 3}{3} p^3 q^3 + 4\binom{n - 3}{2} p^2 q^2.$$


(6) A sequence formed by pooling two samples A and B belonging to F. Let two
samples A and B of sizes $n_1$ and $n_2$ be drawn from a population whose cumulative
distribution function is $F(x)$, $F(x)$ being continuous and $x$ taking values from
$-\infty$ to $+\infty$. By pooling together A and B and arranging them in ascending or
descending order we obtain a sequence of A's and B's as in (4) considered above.
Hence the moments of any distribution arising from this sequence can be obtained
from the corresponding ones for the binomial sequence by substituting


pq = 


(7) Same as (6), $F \neq G$. The calculation of the moments for some of the
distributions discussed above is more complicated when $F \neq G$. We shall show
below how the present theorem enables us to obtain the moments of these
distributions also.

(a) Number of observations of sample A to the left of the $r$th value of the
combined ordered sequence of A's and B's.

$\mu'_{[s]} / s!$ = [Number of ways of selecting $s$ A's from $(r - 1)$ observations]
    $\times$ [Probability that $s$ out of the $(r - 1)$ values to the left of the $r$th
    observation belong to A].

Assuming $n_1 F(a) + n_2 G(a) = r$, the probability that amongst the
values to the left of the $r$th observation there are $s$ A's is

$$\frac{n_1 F(a)}{n_1 F(a) + n_2 G(a)} \cdot \frac{(n_1 - 1)F(a)}{(n_1 - 1)F(a) + n_2 G(a)} \cdot \frac{(n_1 - 2)F(a)}{(n_1 - 2)F(a) + n_2 G(a)} \cdots \text{ to } s \text{ terms}.$$

Using the relation between $F(a)$, $G(a)$ and $r$ we get

$$\frac{\mu'_{[s]}}{s!} = \binom{r - 1}{s} \frac{n_1^{(s)} [F(a)]^s}{r[r - F(a)][r - 2F(a)] \cdots [r - (s - 1)F(a)]}.$$


(b) Number of AB and BA joins between successive observations. As the higher
moments are complicated we shall be content to obtain the second moment.

$\mu'_{[2]} / 2!$ = the sum of the expectations for two joins from (i) three consecutive
observations and (ii) two sets each of two consecutive observations.

Expectation for two joins from three consecutive observations $x_1$, $x_2$, and $x_3$
is given by

(13) $$n_1(n_1 - 1)n_2 \iiint \{1 - F(x_3) + F(x_1)\}^{n_1 - 2} \{1 - G(x_3) + G(x_1)\}^{n_2 - 1}\, dF(x_1)\, dG(x_2)\, dF(x_3)$$
$$+\ n_1 n_2(n_2 - 1) \iiint \{1 - G(x_3) + G(x_1)\}^{n_2 - 2} \{1 - F(x_3) + F(x_1)\}^{n_1 - 1}\, dG(x_1)\, dF(x_2)\, dG(x_3), \qquad x_1 < x_2 < x_3.$$

Expectation for two joins from two sets of two consecutive observations each is
equal to

(14) $$n_1^{(2)} n_2^{(2)} \Big[ \iiiint A\, dF(x_1)\, dG(x_2)\, dF(x_3)\, dG(x_4) + \iiiint A\, dF(x_1)\, dG(x_2)\, dG(x_3)\, dF(x_4)$$
$$+ \iiiint A\, dG(x_1)\, dF(x_2)\, dF(x_3)\, dG(x_4) + \iiiint A\, dG(x_1)\, dF(x_2)\, dG(x_3)\, dF(x_4) \Big],$$
where
$$A = \{1 - F(x_2) + F(x_1) - F(x_4) + F(x_3)\}^{n_1 - 2} \times \{1 - G(x_2) + G(x_1) - G(x_4) + G(x_3)\}^{n_2 - 2}, \qquad x_1 < x_2 < x_3 < x_4.$$
From the above it follows that

(15) $$\frac{\mu'_{[2]}}{2!} = (13) + (14).$$

When $F = G$, this reduces to the expression known.
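(8) Mann and Whitney's T-statistic. In this case the expression for the second
factorial moment reduces to the simple form

(16) $$\mu'_{[2]} = 2n_1 n_2 \int_{-\infty}^{+\infty} \int_{-\infty}^{x_3} \int_{-\infty}^{x_2} [(n_1 - 1) f(x_1) f(x_2) g(x_3) + (n_2 - 1) f(x_1) g(x_2) g(x_3)]\, dx_1\, dx_2\, dx_3$$
$$+\ n_1^{(2)} n_2^{(2)} \left[\int_{-\infty}^{+\infty} \int_{-\infty}^{x_2} f(x_1) g(x_2)\, dx_1\, dx_2\right]^2,$$

where $f(x)$ and $g(x)$ are the density functions for $F$ and $G$.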
(9) AB joins between successive observations for a simple Markoff chain. Let
the matrix of probabilities for a simple Markoff chain be

$$\begin{pmatrix} p_1 & q_1 \\ p_2 & q_2 \end{pmatrix}.$$

Taking the probability that the first observation is A or B as $P$ and $Q$
respectively, the probabilities $P_r(A)$ and $Q_r(B)$ that the $r$th observation is A or B are
given by

(17) $$P_r(A) = \frac{p_2}{1 - \delta} + \frac{P q_1 - Q p_2}{1 - \delta}\, \delta^{r-1}, \qquad Q_r(B) = 1 - P_r(A),$$

where $\delta = p_1 - p_2$ and $p_1 > p_2$.
When the first observation is B, the conditional probability for the $r$th
observation to be A reduces to

(18) $$P_r(A \mid 1B) = \frac{p_2}{1 - \delta}(1 - \delta^{r-1}).$$

This is the same as given by Bartlett.
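
The probability that the $r$th observation of a two-state chain is A can be checked by stepping the chain directly. The sketch below rests on an assumption about notation: $p_1$ is taken to be the probability that A follows A, $p_2$ the probability that A follows B, and $\delta = p_1 - p_2$. Under that reading the recursion $P_{r+1}(A) = p_2 + \delta P_r(A)$ has the closed form used in `prob_a_closed`. The function names are illustrative only.

```python
from fractions import Fraction


def prob_a_iterated(r, P, p1, p2):
    """P_r(A) by stepping the chain: P_{k+1}(A) = P_k(A) p1 + (1 - P_k(A)) p2."""
    prob = P
    for _ in range(r - 1):
        prob = prob * p1 + (1 - prob) * p2
    return prob


def prob_a_closed(r, P, p1, p2):
    """Closed form p2/(1-d) + (P q1 - Q p2)/(1-d) * d**(r-1), with d = p1 - p2."""
    d = p1 - p2
    q1, Q = 1 - p1, 1 - P
    return p2 / (1 - d) + (P * q1 - Q * p2) / (1 - d) * d ** (r - 1)


p1, p2, P = Fraction(7, 10), Fraction(1, 5), Fraction(1, 3)
for r in range(1, 6):
    print(r, prob_a_iterated(r, P, p1, p2), prob_a_closed(r, P, p1, p2))
```

Setting `P = 0` (first observation certainly B) reduces the closed form to $p_2(1 - \delta^{r-1})/(1 - \delta)$, the conditional probability for the $r$th observation to be A.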

In the case of the Markoff chain, unlike the previous cases discussed earlier,
the probability of an AB join depends on the position of A in the sequence, and
the expectation for two AB joins is given by

(19) $$q_1^2 \big[ P_1(A)\{P_3(A \mid 2B) + P_4(A \mid 2B) + P_5(A \mid 2B) + \cdots + P_{n-1}(A \mid 2B)\}$$
$$+ P_2(A)\{P_4(A \mid 3B) + P_5(A \mid 3B) + P_6(A \mid 3B) + \cdots + P_{n-1}(A \mid 3B)\}$$
$$+ P_3(A)\{P_5(A \mid 4B) + P_6(A \mid 4B) + \cdots + P_{n-1}(A \mid 4B)\}$$
$$+ \cdots + P_{n-3}(A)\{P_{n-1}(A \mid n - 2, B)\} \big],$$

where $P_r(A \mid sB)$ is the conditional probability that the $r$th observation is A,
given that the $s$th observation is B when $r > s$. Summing up the above series
after substituting for $P$'s from (17) and (18), we get


»)! - ? 


wie _ [S— Be as sci gl _ aa - 3) 


. f 
9 73) — 3) 


~ jas 


f oo n—3 ‘ 
~ “ =) — (n — 3)8""> 


where 


’ Pa, — Qpe 
a=- =... and 3 = 1 —— 
i—é 1-6 

It may be added that the result given in this paper can be used for deriving
the moments of many other distributions of a similar kind.

THE UNIQUENESS OF THE TRIANGULAR ASSOCIATION
SCHEME


By W. S. Connor 
National Bureau of Standards 


1. Summary. Parameters for a class of partially balanced incomplete block 
designs with two associate classes are immediately implied by the triangular 
association scheme. This paper deals with the more difficult question of whether 
or not these parameters imply the triangular association scheme. 


2. Introduction. A partially balanced incomplete block design with two
associate classes [1] is said to be triangular [2], [3] if the number of treatments
$v = n(n - 1)/2$ and the association scheme is an array of $n$ rows and $n$ columns
with the following properties:

(a) The positions in the principal diagonal are blank. 

(b) The n(n — 1)/2 positions above the principal diagonal are filled by the 
numbers 1, 2,---, n(n — 1)/2 corresponding to the treatments. 

(c) The positions below the principal diagonal are filled so that the array is 
symmetrical about the principal diagonal. 

(d) For any treatment $i$ the first associates are exactly those treatments which
lie in the same row and the same column as $i$.

The following relations clearly hold:

(1) The number of first associates of any treatment is $n_1 = 2n - 4$.

(2) With respect to any two treatments $\theta_1$ and $\theta_2$ which are first associates,
the number of treatments which are first associates of both $\theta_1$ and $\theta_2$ is

$$p^1_{11}(\theta_1, \theta_2) = n - 2.$$

(3) With respect to any two treatments $\theta_3$ and $\theta_4$ which are second associates,
the number of treatments which are first associates of both $\theta_3$ and $\theta_4$ is
$$p^2_{11}(\theta_3, \theta_4) = 4.$$
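
Relations (1), (2) and (3) can be confirmed by direct construction for any particular $n$. The sketch below is one possible encoding, not the author's: each treatment is represented as the unordered pair of row and column labels of the cell it occupies above the diagonal, so that two treatments are first associates exactly when their pairs share one label. All names are illustrative.

```python
from itertools import combinations


def triangular_scheme(n):
    """Treatments are the unordered pairs {i, j} of row/column labels 0..n-1.

    Two treatments are first associates when they share exactly one label,
    i.e. when they lie in a common row (or column) of the array.
    """
    treatments = [frozenset(pair) for pair in combinations(range(n), 2)]
    first = {
        t: {u for u in treatments if len(t & u) == 1}
        for t in treatments
    }
    return treatments, first


def scheme_parameters(n):
    """Return the sets of observed values of n_1, p^1_11 and p^2_11."""
    treatments, first = triangular_scheme(n)
    n1 = {len(first[t]) for t in treatments}
    p1_11, p2_11 = set(), set()
    for t, u in combinations(treatments, 2):
        common = len(first[t] & first[u])
        (p1_11 if u in first[t] else p2_11).add(common)
    return n1, p1_11, p2_11


print(scheme_parameters(9))
```

Each returned set should contain a single value, namely $2n - 4$, $n - 2$ and 4 respectively.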

We wish to examine the converse, i.e., whether or not relations (1), (2) and
(3) imply (a), (b), (c), and (d). We shall give a proof for $n \geq 9$ which shows
that the converse is true. The cases with $n < 9$ will not be considered, although
the author has found that it is true for several small values of $n$, and conjectures
that it is true for the rest.

As background for this problem, it is interesting to recall what has been found 
for some other classes of partially balanced designs. In the analogous problem 
for the group divisible designs it is easy to show that the converse is true [4]. 
For the latin square designs the converse is true for a sufficiently large number of 
treatments, but is not always true, as has been shown by example [5]. 

The present problem is closely related to problems considered in [6] and [7].
The arguments used here could be substituted for some of the arguments in
those papers.


3. A characterization of the triangular association scheme. The proof will 
consist of showing that there exist sets of treatments which satisfy the following 
theorem. 


THEOREM. The triangular association scheme for $n(n - 1)/2$ treatments exists
if and only if there exist sets of treatments $S_j$, $j = 1, \cdots, n$, such that:
(i) Each $S_j$ consists of $n - 1$ treatments.

(ii) Any treatment is in precisely two sets $S_j$.

(iii) Any two distinct sets $S_i$, $S_j$ have exactly one treatment in common.

Proof. Necessity follows from the observation that the $n$ rows of treatments in
the triangular association scheme are the $n$ sets $S_j$.

Sufficiency follows from noting a correspondence between the rows and
columns of the association scheme and the sets $S_j$. To display the
correspondence, we denote the unique element common to sets $S_i$ and $S_j$ by $a(i, j) = a(j, i)$.
Then the correspondence is as follows: We let set $S_i$ correspond to the $i$th row
and column, and put element $a(i, j)$ in the $i$th row and $j$th column of the
association scheme. Because $a(i_1, j_1) = a(i_2, j_2)$ implies that $i_1 = i_2$ and $j_1 = j_2$,
the element $a(i, j)$ occurs only in the $i$th row (column) and $j$th column (row).
This fills up the association scheme as described in (a), (b) and (c). Further, if
we let "belonging to the same set $S_j$" correspond to "being first associates",
then (d) is satisfied.
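
Both directions of the Theorem can be illustrated on a concrete $n$. In the sketch below, which uses a pair-of-labels encoding chosen here for convenience rather than taken from the paper, `row_sets` forms the $n$ rows of the triangular array, `satisfies_theorem` checks conditions (i), (ii) and (iii), and `rebuild_array` carries out the correspondence of the sufficiency proof by placing the unique common element of $S_i$ and $S_j$ in cell $(i, j)$.

```python
from itertools import combinations


def row_sets(n):
    """S_j = treatments in row j of the triangular array (pairs containing j)."""
    treatments = [frozenset(pair) for pair in combinations(range(n), 2)]
    return treatments, [{t for t in treatments if j in t} for j in range(n)]


def satisfies_theorem(n):
    """Check conditions (i), (ii) and (iii) for the n row sets."""
    treatments, sets = row_sets(n)
    cond_i = all(len(s) == n - 1 for s in sets)
    cond_ii = all(sum(t in s for s in sets) == 2 for t in treatments)
    cond_iii = all(len(a & b) == 1 for a, b in combinations(sets, 2))
    return cond_i, cond_ii, cond_iii


def rebuild_array(n):
    """Place the unique common element a(i, j) of S_i and S_j in cell (i, j)."""
    _, sets = row_sets(n)
    array = [[None] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        (common,) = sets[i] & sets[j]  # fails loudly unless exactly one element
        array[i][j] = array[j][i] = common
    return array


print(satisfies_theorem(9))
```

The rebuilt array has a blank diagonal and is symmetric, matching properties (a) and (c).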


4. The existence of sets $S_j$ which satisfy the Theorem. In this section we
shall show for $n \geq 9$ that there exist sets $S_j$ which satisfy the Theorem. The
proof makes conspicuous use of the condition (3) that $p^2_{11} = 4$. In fact, in
constructing the proof, the author was attracted to the singular fact that this
parameter does not depend on $n$.

Throughout the proof, we shall employ certain conventions. In citing a reason
why something is or is not true, we often shall write "$p^1_{11}(\theta_1, \theta_2)$" or "$p^2_{11}(\theta_3, \theta_4)$,"
whereby we mean to refer to particular treatments $\theta_1$, $\theta_2$, $\theta_3$, and $\theta_4$. Also, we
shall write "$(\theta_1, \theta_2) = 1$ (or 2)," meaning that treatments $\theta_1$ and $\theta_2$ are first
(or second) associates.

In developing the proof, the author used a matrix in which the $i$th row and
column correspond to the $i$th treatment, and the entry in the intersection of the
$i$th row and $j$th column is 1 or 2, depending on whether treatments $i$ and $j$ are
first or second associates. Though this matrix is not explicitly used below, it is
implicit, and it is believed that the reader will find the use of this matrix helpful
in following the proof.

We begin by proving a lemma which will be used repeatedly in the sequel. 

LEMMA 1. With respect to any two initial treatments $\theta_1$ and $\theta_2$ which are first
associates, the $n - 3$ $(n \geq 9)$ treatments which are first associates of $\theta_1$ and second
associates of $\theta_2$ pairwise are first associates.


Proof. For simplicity we shall replace $\theta_1$ by 1 and $\theta_2$ by 2. From (1) and (2)
it follows that there are $n - 2$ treatments which are first associates of both
treatments 1 and 2, and $n - 3$ treatments which are first associates of treatment
1 and second associates of treatment 2. We shall refer to the treatments of the
first set as treatments $3, \cdots, n$; and to those of the second set as treatments
$n + 1, \cdots, 2n - 3$. These sets will be denoted respectively by

$$T_1 = T_1(3, \cdots, n)$$
and $$T_2 = T_2(n + 1, \cdots, 2n - 3).$$

We first show that any treatment $\alpha$ in $T_2$ cannot have more than one second
associate in $T_2$. We observe that $p^2_{11}(2, \alpha) = 4$, of which one such treatment is
treatment 1. Thus, treatment $\alpha$ has at most three first associates in $T_1$. Because
$p^1_{11}(1, \alpha) = n - 2$, treatment $\alpha$ has at least $n - 5$ first associates in $T_2$, and
hence at most one second associate in $T_2$.
We now shall show that even this one second associate is impossible.
Consider any two treatments $\alpha$ and $\beta$ in $T_2$, and assume that $(\alpha, \beta) = 2$. We have
established that treatment 1 and the $n - 5$ treatments other than $\alpha$ and $\beta$
in $T_2$ are first associates of both $\alpha$ and $\beta$. But for $n \geq 9$ the condition that
$p^2_{11}(\alpha, \beta) = 4$ is violated, which shows that $(\alpha, \beta) = 1$. This completes the
proof of Lemma 1.

Our next lemma shows the existence of sets S; which satisfy (i) and (ii) of 
the theorem. 

LEMMA 2. For $n \geq 9$, any initial treatment $\theta$ is an element of exactly two sets
of treatments $S_1$ and $S_2$ which are such that a set contains $n - 1$ treatments, the
treatments in a set pairwise are first associates, and $\theta$ is the unique element common
to $S_1$ and $S_2$.


Proof. We begin by showing that Lemma 1 implies that there are $n - 4$
treatments in $T_1$ which pairwise are first associates. For this purpose, it is
convenient to define sets $T_3 = T_3(3, \cdots, n - 2)$ and

$$T_4 = T_4(n + 2, \cdots, 2n - 3).$$

From Lemma 1 and the condition that $p^1_{11}(1, \alpha) = n - 2$ for every
treatment $\alpha$ in $T_2$, it follows that every treatment in $T_2$ has two first associates and
$n - 4$ second associates in $T_1$. Without essential loss of generality, let treatment
$n + 1$ be a second associate of every treatment in $T_3$, and let $(n - 1, n + 1) =
(n, n + 1) = 1$. Then by Lemma 1, letting $\theta_1 = 1$ and $\theta_2 = n + 1$, the
treatments in $T_3$ pairwise are first associates.

We still have to determine how treatments $n - 1$ and $n$ intersect the
treatments in $T_3$, $T_4$ and each other. We shall show that $(n - 1, n) = 2$ and either
we have Case 1: $(n - 1, \alpha) = 1$, $(n, \alpha) = 2$ for all treatments $\alpha$ in $T_3$ and
$(n - 1, \beta) = 2$, $(n, \beta) = 1$ for all treatments $\beta$ in $T_4$; or we have Case 2:
$(n - 1, \alpha) = 2$, $(n, \alpha) = 1$ for all $\alpha$ in $T_3$ and $(n - 1, \beta) = 1$, $(n, \beta) = 2$ for
all $\beta$ in $T_4$.

Suppose that treatment $n - 1$ is a second associate of some treatment in $T_4$,
say treatment $\beta$. We shall show that we have Case 1. From Lemma 1 and $p^1_{11}(1, \beta)$
it follows that treatment $\beta$ has two first associates and $n - 4$ second associates
among the treatments of $T_3$ and treatments $n - 1$ and $n$. Therefore, treatment
$\beta$ has at least $n - 6$ second associates in $T_3$. Without essential loss of generality,
let these be treatments $3, \cdots, n - 4$. Applying Lemma 1 with $\theta_1 = 1$ and $\theta_2 = \beta$,
it follows that these treatments are first associates of treatment $n - 1$.

Suppose that $(n - 3, n - 1) = 2$. Then it would be necessary that
$p^2_{11}(n - 3, n - 1) = 4$. However, treatments $1, \cdots, n - 4$ are first associates
of both treatments $n - 3$ and $n - 1$, violating $p^2_{11}(n - 3, n - 1) = 4$ for $n \geq 9$.
Similarly, treatment $n - 2$ cannot be a second associate of treatment $n - 1$.

We have shown that if treatment $n - 1$ has a second associate in $T_4$, then it is
a first associate of every treatment in $T_3$. Further, the treatments in $T_3$ and
treatments 2 and $n + 1$ satisfy the condition that $p^1_{11}(1, n - 1) = n - 2$,
implying that treatment $n - 1$ is a second associate of treatment $n$ and the
treatments in $T_4$.

By applying Lemma 1 with $\theta_1 = 1$ and $\theta_2 = n - 1$, it follows that $(n, \beta) = 1$
for all $\beta$ in $T_4$. Because treatment 2 and the treatments in $T_2$ satisfy $p^1_{11}(1, n)$,
it follows that $(n, \alpha) = 2$ for all $\alpha$ in $T_3$. This demonstrates Case 1.

If $(n - 1, \beta) \neq 2$ for any $\beta$ in $T_4$, then $(n - 1, \beta) = 1$ for all $\beta$ in $T_4$. But
treatment 2 and the treatments in $T_2$ satisfy $p^1_{11}(1, n - 1)$, and it follows that
$(n - 1, \alpha) = 2$ for all $\alpha$ in $T_3$ and that $(n - 1, n) = 2$. We now apply Lemma 1
with $\theta_1 = 1$ and $\theta_2 = n - 1$ to show that $(n, \alpha) = 1$ for all $\alpha$ in $T_3$. Because
treatments $2, \cdots, n - 2, n + 1$ satisfy $p^1_{11}(1, n) = n - 2$, it follows that
$(n, \beta) = 2$ for all $\beta$ in $T_4$. This establishes Case 2.


We now observe a set $S_1$ which contains treatments 1, 2, the treatments in
$T_3$, and either treatment $n - 1$ or $n$. Also, a set $S_2$ which contains treatment 1,
the treatments in $T_2$, and the one of treatments $n - 1$ and $n$ which is not in
$S_1$. These sets are such that their elements pairwise are first associates. They
are the sets of Lemma 2.

To show that there are no other such sets, we shall consider the way in which
the treatments in $T_3$ are associated with the treatments in $T_4$. Consider any
treatment $\alpha$ in $T_3$, and the condition $p^1_{11}(1, \alpha) = n - 2$. Treatment 2, the
remaining $(n - 5)$ treatments in $T_3$, and either treatment $n - 1$ or $n$ are $n - 3$
treatments which satisfy this condition. Hence there is exactly one more such
treatment in $T_4$. Similarly, any treatment $\beta$ in $T_4$ has exactly one first associate
in $T_3$. It follows that no other set of $n - 1$ treatments exists such that its
treatments pairwise are first associates. This completes the proof of Lemma 2.


The sets found in Lemma 2 obey (i) and (ii) of the Theorem. To find the num- 
ber s of sets and to prove (iii), we observe that each of s sets contains n — 1 ele- 
ments, so that there are s(n — 1) (not necessarily distinct) elements in the s sets. 
But every treatment occurs in exactly two sets, so that s(n — 1) = 2v = n(n — 1) 


or $s = n$. Thus the number of pairs of sets is $n(n - 1)/2 = v$, and because every

THE LIMITING DISTRIBUTION OF BROWNIAN MOTION
IN A BOUNDED REGION WITH INSTANTANEOUS
RETURN


By B. SHERMAN 


Westinghouse Research Laboratories 


1. Summary. A point executes Brownian motion in a bounded, connected, and
open three dimensional region $D$. When it reaches the boundary $\Gamma$, at point $\alpha$,
it is instantaneously returned to $D$ according to probability measure $\mu(\alpha)$ (we
write $\mu(\alpha, A)$ for the measure of set $A$), and the Brownian motion is resumed.
This is a Markov process and, subject to certain regularity conditions on $\Gamma$ and
$\mu(\alpha)$, we derive the limiting distribution of the process. Processes of this sort
have been considered by Feller [1]; he has obtained the transition probabilities
of such processes. He is concerned more generally with Markov processes with
continuous sample functions on a linear interval; the return may be
instantaneous or after a random period of time.

Let $p^0(t, \xi, A)$ be the probability that the point is in set $A$ of $D$ at time $t$ when
it is initially at point $\xi$ of $D$, with the additional restriction that no boundary
contacts have been made. It is known that

(1) $$p^0(t, \xi, A) = \int_A u(t, \xi, x)\, dx,$$

where $dx$ is the volume element about $x$ and $u$ is the solution of the equation
$$\tfrac{1}{2} \Delta u = \partial u / \partial t,$$
subject to the conditions
$$u(t, \xi, \alpha) = 0, \quad \alpha \in \Gamma; \qquad \lim_{t \to 0} \int_C u(t, \xi, x)\, dx = 1,$$
where $C$ is any sphere of non-zero radius with center $\xi$ which is entirely within $D$.
We may write explicitly
$$u(t, \xi, x) = \sum_k e^{-\lambda_k t} u_k(\xi) u_k(x),$$
where $\lambda_k$ is the $k$th eigenvalue and $u_k(x)$ the corresponding eigenfunction of the
equation $\Delta u + 2\lambda u = 0$ subject to the boundary condition $u = 0$ on $\Gamma$. If
$K(\xi, x)$ is the Green's function of $\Delta u = 0$ in $D$, then ([2], and [3], page 273)

(2) $$K(\xi, x) = \frac{1}{2} \int_0^\infty u(t, \xi, x)\, dt.$$


Let $\phi(t, \xi, \alpha)\, dt\, d\alpha$ be the probability the point is absorbed at surface element
$d\alpha$ of $\Gamma$ between $t$ and $t + dt$ when it is initially at point $\xi$ of $D$. Then $\phi$ is half
the interior normal derivative of $u$ at point $\alpha$ of $\Gamma$ ([3], page 273). When the point
is initially at $\xi$ the probability of ultimate absorption in set $S$ of $\Gamma$ is given by

(3) $$\pi(\xi, S) = \int_S \int_0^\infty \phi(t, \xi, \alpha)\, dt\, d\alpha.$$
We may define a discrete parameter Markov process with $\Gamma$ as state space by
taking as transition probability

(4) $$\pi(\alpha, S) = \int_D \pi(\xi, S)\, \mu(\alpha, d\xi).$$
This Markov process has a limiting distribution $\pi$ which satisfies the equation

(5) $$\pi(S) = \int_\Gamma \pi(\alpha, S)\, \pi(d\alpha).$$

We define a measure of sets of $D$ by

$$\lambda(A) = \int_\Gamma \mu(\alpha, A)\, \pi(d\alpha).$$


We may now write the density function for the limiting distribution. If
$M(\xi)$ is the mean time of reaching the boundary when the point is initially at $\xi$,

$$M(\xi) = \int_\Gamma \int_0^\infty t\, \phi(t, \xi, \alpha)\, dt\, d\alpha,$$

then the density function of the limiting distribution is

(6) $$\frac{2 \int_D K(\xi, x)\, \lambda(d\xi)}{\int_D M(\xi)\, \lambda(d\xi)}.$$
If we are given a probability measure $\lambda$ in $D$ and the return is always according
to $\lambda$, then it is clear that the limiting density of this process is also given by (6).
If $\lambda$ concentrates at a single point $\xi$ we may drop the integrals in (6), and in
particular we get

$$M(\xi) = 2 \int_D K(\xi, x)\, dx.$$
We note that (6) is essentially the steady distribution of temperature in the
following problem: $D$ is a homogeneous heat conducting body whose boundary
is kept at temperature 0 and in which there is a constant source of heat distributed
according to $\lambda$.

Regarding the regularity conditions, we shall assume that $\Gamma$ is made up of
finitely many surfaces, each with a continuously turning tangent plane and
that $D$ has a Green's function ([4], page 262). We will assume there is a closed
set $B$ in $D$ such that

$$\inf_{\alpha \in \Gamma} \mu(\alpha, B) = \gamma > 0.$$
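
The statement that the limiting density is a normalized Green's function when the return is always to a single point has a simple discrete counterpart that can be verified exactly. The sketch below is an analogy only and makes several assumptions not in the paper: Brownian motion in $D$ is replaced by a simple symmetric random walk on the interior states $1, \cdots, N - 1$ of $\{0, \cdots, N\}$, the boundary is $\{0, N\}$, and every boundary contact is redirected instantly to a fixed state `k`. The function names are hypothetical.

```python
from fractions import Fraction


def green(k, j, N):
    """Expected visits to j, starting from k, of a simple symmetric random
    walk on {0, ..., N} before it is absorbed at 0 or N."""
    return Fraction(2 * min(k, j) * (N - max(k, j)), N)


def step_distribution(pi, k, N):
    """One step of the walk with instantaneous return to state k.

    pi maps interior states 1..N-1 to probabilities. A step that would
    land on the boundary (0 or N) is redirected to k.
    """
    out = {j: Fraction(0) for j in range(1, N)}
    for i, mass in pi.items():
        for nxt in (i - 1, i + 1):
            target = k if nxt in (0, N) else nxt
            out[target] += mass / 2
    return out


def limiting_distribution(k, N):
    """Normalized Green's function: the discrete counterpart of (6) when
    the return measure is concentrated at the single state k."""
    weights = {j: green(k, j, N) for j in range(1, N)}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


pi = limiting_distribution(3, 10)
print(step_distribution(pi, 3, 10) == pi)
```

In this discrete setting the normalizing constant, the sum of `green(k, j, N)` over `j`, equals the mean absorption time $k(N - k)$ of the walk, which parallels the role of $M(\xi)$ in the denominator of (6). The resulting distribution is piecewise linear with its peak at `k`, so it does not increase toward the boundary.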

2. Origin of the problem. This problem had its origins in the ecological
research of Professor Thomas Park of the University of Chicago. He has been
investigating problems of population stability and inter-species competition of
flour beetles. It was discovered, on statistical investigations suggested in part
by Jerzy Neyman, that the distribution of the beetles in the container of flour
was not uniform, with the density increasing toward the boundaries of the
container. The problem arose as to whether the nonuniformity might be simply a
consequence of the random motion of the beetles or whether it ought to be
attributed to some inhomogeneity such as a temperature gradient in the flour.
To check the plausibility of the idea that the nonuniformity might arise from
random motion alone, we have set up a model which may have some relevance
to the actual situation. The region $D$ represents the volume of flour. We assume
the independence of the motions of the beetles so that we may confine ourselves
to the random motion of a single point. This is a reasonable assumption if the
density of the beetles is low. For the random motion we take Brownian motion;
this is appropriate if we want path continuity and spatial homogeneity. Finally,
we must introduce some mechanism of return from the boundary; we use the
device of instantaneous return. If the return distribution is concentrated near
the point of contact on the boundary then the device has some semblance of
plausibility. More precisely we may suppose $\mu(\alpha, A(\alpha)) = 1$, where $A(\alpha)$ is
that set of points of $D$ whose distance from $\alpha$ is less than or equal to $\delta$, a small
positive number. Then if $E$ is the subset of points of $D$ whose distance from
$\Gamma$ is in excess of $\delta$ we have $\lambda(E) = 0$. If we are prepared to accept the density of
distribution in $E$, as given by (6), as a theoretical model for what is observed
then we are faced with a contradiction. For the density is a harmonic function
in $E$, by virtue of $\lambda(E) = 0$. Because of the minimum-maximum properties of
such functions we cannot have increasing density from the central parts of $E$
outward to the boundary of $E$ since that would entail a minimum at an interior
point of $E$.


3. Derivation of the limiting distribution. We sketch a proof that we have
defined a process by the instantaneous return mechanism. This is equivalent
to proving that finitely many contacts occur in a finite time with probability 1.
By the assumption on $\mu$ it will happen infinitely often with probability 1 that the
point is returned to $B$. Let $T(x)$ be the time to reach $\Gamma$, starting at $x$. If $\delta$ is
positive Prob $(T(x) > \delta)$ is a continuous function of $x$ which achieves a positive
minimum on $B$. Thus of the times the point is returned to $B$ it will happen
infinitely often with probability 1 that the time to reach $\Gamma$ is in excess of $\delta$. This
implies that infinitely many contacts in a finite time has probability 0.

Let $p(t, x, A)$ be the transition probability of the process, i.e., the probability
the point is in $A$ at time $t$ when it is initially at $x$. Then we prove the limiting
distribution $p$ exists and

(7) $$p(t, x, A) = p(A) + f(t, x, A),$$
where
(8) $$|f(t, x, A)| < a e^{-kt}.$$

Here $a$ and $k$ are positive and independent of $x$ and $A$. To simplify the notation
we will make the following convention: if $f(x)$ is a function on $D$ then we define
a corresponding function $f(\alpha)$ on $\Gamma$ by taking the integral, over $D$, of $f$ with
respect to the measure $\mu(\alpha)$. With this convention we may replace $x$ by $\alpha$ in (7)
and (8). We note that both $f(t, x, A)$ and $f(t, \alpha, A)$ are integrable with respect to
$t$ from 0 to $\infty$.

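Proceeding with the proof we use the fact that $u(t, \xi, x)$ is strictly positive
for all $\xi$ and $x$ in $D$ and for positive $t$. Then the minimum $v(x, \delta, t)$, achieved by
$u(T, \xi, x)$ subject to $\xi \in B$ and $t - \delta \leq T \leq t$, is also strictly positive, and it
follows directly that for $t - \delta \leq T \leq t$,

$$p^0(T, \alpha, A) \geq \gamma \int_A v(x, \delta, t)\, dx.$$
It is clear that
$$h(\delta) = \inf_{x \in D} \int_\Gamma \int_0^\delta \phi(t, x, \alpha)\, dt\, d\alpha$$
satisfies $0 < h(\delta) < 1$ for all $\delta > 0$. Let $p^1(t, \xi, A)$ be the probability the point,
initially at $\xi$, is in $A$ at time $t$ having made exactly one boundary contact. Then
if $t > \delta$,

$$p^1(t, \xi, A) = \int_\Gamma \int_0^t \phi(\tau, \xi, \alpha)\, p^0(t - \tau, \alpha, A)\, d\tau\, d\alpha$$
$$\geq \int_\Gamma \int_0^\delta \phi(\tau, \xi, \alpha)\, p^0(t - \tau, \alpha, A)\, d\tau\, d\alpha$$
$$\geq \int_\Gamma \int_0^\delta \phi(\tau, \xi, \alpha)\, d\tau\, d\alpha \cdot \gamma \int_A v(x, \delta, t)\, dx$$
$$\geq h(\delta)\, \gamma \int_A v(x, \delta, t)\, dx.$$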
We follow now the proof of a similar theorem given by Doob ([5], page 197).
If $m(t, A)$ and $M(t, A)$ are respectively the infimum and supremum of $p(t, \xi, A)$
as $\xi$ varies over $D$, then $M(t, A) \geq m(t, A)$ and by the Chapman-Kolmogorov
equation it can be seen that $M(t, A)$ is non-increasing and $m(t, A)$ is
non-decreasing. For fixed $t_0$, $\xi_0$, $x_0$ define the set function

$$\psi(A) = p(t_0, \xi_0, A) - p(t_0, x_0, A).$$
There is a set $A^+$ on which $\psi$ is maximum, such that $\psi(A) \geq 0$ for any subset $A$
of $A^+$, and such that $\psi(A) \leq 0$ for any subset $A$ of $A^- = D - A^+$. We have,
assuming $\delta$ such that $0 < \delta < t_0$,

$$\psi(A^+) = 1 - p(t_0, \xi_0, A^-) - p(t_0, x_0, A^+)$$
$$\leq 1 - p^1(t_0, \xi_0, A^-) - p^1(t_0, x_0, A^+)$$
$$\leq 1 - h(\delta)\gamma \int_D v(x, \delta, t_0)\, dx = c < 1.$$

Following now a line of argument analogous to Doob's we have

$$M(t, A) - m(t, A) \leq c^{t/t_0 - 1},$$

from which it follows that $M(t, A)$ and $m(t, A)$ have a common limit $p(A)$ and
that

$$|p(t, x, A) - p(A)| \leq M(t, A) - m(t, A) \leq a e^{-kt}.$$

Thus (7) and (8) are established, with $a = c^{-1}$ and $k = (1/t_0) \log c^{-1}$.

Before deriving (6) we have to establish the existence of the limiting
distribution $\pi$ of the boundary process. To this end we prove the lemma
(9) $$\zeta = \sup_S \left(\max_{x \in B} \pi(x, S) - \min_{x \in B} \pi(x, S)\right) < 1.$$
We note that $\pi(x, S)$ is, for fixed $S$, a harmonic function of $x$ ([3], page 273).
If (9) is not true there will be a sequence of sets $S_n$ such that
(10) $$\max_{x \in B} \pi(x, S_n) \to 1, \qquad \min_{x \in B} \pi(x, S_n) \to 0.$$

Since $\pi(x, S_n)$ is a sequence of harmonic functions with $0 \leq \pi(x, S_n) \leq 1$, we
may extract a subsequence which converges to a harmonic function $f(x)$
uniformly on any compact subset of $D$ ([4], page 249). Without change of notation
we suppose this done. However (10) implies that $f(x)$ achieves the values 1 and
0 on $B$, which contradicts the fact that $f(x)$ is harmonic in $D$ and $0 \leq f(x) \leq 1$.

To prove the existence of the limiting distribution of the boundary process
we again follow the lines of Doob's proof. For fixed $\alpha$ and $\beta$ we define the set
function

$$\psi(S) = \pi(\alpha, S) - \pi(\beta, S).$$

Associated with $\psi$ are the sets $S^+$ and $S^-$, and we have

$$\psi(S^+) = 1 - (\pi(\alpha, S^-) + \pi(\beta, S^+))$$
$$\leq 1 - \left(\int_B \pi(x, S^-)\, \mu(\alpha, dx) + \int_B \pi(x, S^+)\, \mu(\beta, dx)\right)$$
$$\leq 1 - \gamma \left(\min_{x \in B} \pi(x, S^-) + \min_{x \in B} \pi(x, S^+)\right)$$
$$= 1 - \gamma \left(1 - \max_{x \in B} \pi(x, S^+) + \min_{x \in B} \pi(x, S^+)\right)$$
$$\leq 1 - \gamma(1 - \zeta).$$
If now we introduce $m^{(n)}(S)$ and $M^{(n)}(S)$, the infimum and supremum of the $n$
step transition probability $\pi^{(n)}(\alpha, S)$ as $\alpha$ ranges over $\Gamma$, then following Doob's
proof

$$M^{(n)}(S) - m^{(n)}(S) \leq (1 - \gamma(1 - \zeta))^n,$$
from which it follows that the limiting distribution exists and satisfies (5).


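We are now in position to derive (6). We have

$$p(t, \xi, A) = p^0(t, \xi, A) + \int_\Gamma \int_0^t \phi(\tau, \xi, \alpha)\, p(t - \tau, \alpha, A)\, d\tau\, d\alpha.$$

Introducing (7) and integrating with respect to $t$ we get, after some reductions,

(11) $$\int_0^T p^0(t, \xi, A)\, dt = p(A)\left[T\left(1 - \int_0^T \int_\Gamma \phi(\tau, \xi, \alpha)\, d\alpha\, d\tau\right) + \int_0^T \int_\Gamma \tau \phi(\tau, \xi, \alpha)\, d\alpha\, d\tau\right] + I(T, \xi, A),$$
where
(12) $$I(T, \xi, A) = \int_0^T f(t, \xi, A)\, dt - \int_0^T \int_\Gamma \int_0^t \phi(\tau, \xi, \alpha) f(t - \tau, \alpha, A)\, d\tau\, d\alpha\, dt.$$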


The second term in the bracket on the right of (11) tends to $M(\xi)$ as $T \to \infty$
and we show that the first term tends to 0. This term can be written

(13) $$T\, \mathrm{Prob}\,(x(t) \in D,\ 0 < t \leq T \mid x(0) = \xi).$$

Let the coordinates of point $x$ be $x_1$, $x_2$, $x_3$ and suppose $D$ is contained between
the planes $x_1 = a$ and $x_1 = -a$. Then (13) tends to 0 if the expression
(14) $$T\, \mathrm{Prob}\,(-a < x_1(t) < a,\ 0 < t \leq T \mid x_1(0) = \xi_1)$$
tends to 0. We may write (14) explicitly

$$T \sum_{n=0}^\infty \frac{4}{(2n + 1)\pi} \sin\left(\frac{(2n + 1)\pi(\xi_1 + a)}{2a}\right) \exp\left(-\frac{(2n + 1)^2 \pi^2 T}{8a^2}\right),$$

which is less than

(15) $$T \sum_{n=0}^\infty \frac{4}{(2n + 1)\pi} \exp\left(-\frac{(2n + 1)^2 \pi^2 T}{8a^2}\right),$$

and it is easily proved that (15) tends to 0. Letting $T \to \infty$ in (11) we get
(16) $$\int_0^\infty p^0(t, \xi, A)\, dt = p(A) M(\xi) + I(\infty, \xi, A).$$
Jo 


Referring to (12) and (3) we may write, on introducing the variables $\tau' = \tau$
and $t' = t - \tau$,

$$I(\infty, \xi, A) = \int_0^\infty f(t, \xi, A)\, dt - \int_\Gamma \left(\int_0^\infty \phi(\tau', \xi, \alpha)\, d\tau'\right) \left(\int_0^\infty f(t', \alpha, A)\, dt'\right) d\alpha$$
$$= \int_0^\infty f(t, \xi, A)\, dt - \int_\Gamma \left(\int_0^\infty f(t, \alpha, A)\, dt\right) \pi(\xi, d\alpha).$$

The integration of the right side with respect to measure $\lambda$ is equivalent to
consecutive integrations with respect to $\mu(\beta)$ and $\pi$. The first integration gives,
using (4),

$$\int_0^\infty f(t, \beta, A)\, dt - \int_\Gamma \left(\int_0^\infty f(t, \alpha, A)\, dt\right) \pi(\beta, d\alpha);$$

and the second, using (5), gives the value 0. Thus integrating on both sides of
(16) with respect to measure $\lambda$ we get

$$\int_D \int_0^\infty p^0(t, \xi, A)\, dt\, \lambda(d\xi) = p(A) \int_D M(\xi)\, \lambda(d\xi).$$

This equation, together with (1) and (2), implies that (6) is the density function
of the limiting distribution $p$.

ON THE DISTRIBUTION OF A STATISTIC BASED ON
ORDERED UNIFORM CHANCE VARIABLES


By SHANTI S. GUPTA AND MILTON SOBEL
Bell Telephone Laboratories


1. Summary. The exact distribution of a statistic based on the $r$ smallest of
$n$ independent observations from a unit uniform distribution is derived. In
life-testing terminology, this statistic includes as special cases (i) the sum of the
$r$ earliest failure times, (ii) the total observed life up to the $r$th failure, and (iii)
the sum of all failure times. The density, cumulative distribution function
(c.d.f.) and first four moments of the general statistic are summarized in Sec. 2.
Section 3 gives the derivation of the density and c.d.f. The moments are obtained
from the moment generating function in Sec. 4. Asymptotic normality under
certain conditions is proved in Sec. 5 and illustrations of the rapidity of approach
to normality are given in Sec. 6.


2. Introduction and statement of results. We shall consider the statistic
(2.1) $$T^{(n)}_{r,m} = t_1 + t_2 + \cdots + t_r + (m - r)t_r,$$

where $t_i = t_i^{(n)}$ is the $i$th smallest of $n$ independent observations and $m$ is greater
than $r - 1$ but is not necessarily an integer. For $m = n$ this statistic can be
interpreted as the total observed life in a life-testing experiment without
replacement. When the underlying distribution of the unordered $t$'s is exponential,
i.e., $f(t) = (1/\theta)e^{-t/\theta}$, then it is known [3] that $2T^{(n)}_{r,n}/\theta$ is distributed as
chi-square, $\chi^2_{2r}$, with $2r$ degrees of freedom.

Before stating further results let us introduce for $0 < t \leq m$ and non-negative
integers $p$, $q$, $n$


n—1 n—l a\n-l 
99 (q.n mE t _({P (¢ — 1) _ (2); — 2) a) dyes 
a ae *1(3) me (?) (m — 1 * \2) (m — 2) P 


/ 


~ 





where m > p,n = 1 and the summation is continued as long as the arguments 
t,t — 1, ¢ — 2,--- are positive. It is understood that the binomial coefficient 


J 
tion. 


? . . \ : 
(?) = 0 forp < j so that there are at most (p + 1) terms in the above summa- 


It is clear from (2.1) that $T_{n,n}^{(n)}$ is the sum of all the n observations. When the
underlying distribution is unit uniform, then the density of $T_{n,n}^{(n)}$ is given on p.
246 of [2] by

(2.3)    $f_{n,n}^{(n)}(t) = \frac{1}{(n-1)!}\left\{\binom{n}{0}t^{n-1} - \binom{n}{1}(t-1)^{n-1} + \binom{n}{2}(t-2)^{n-1} - \cdots\right\}$.

(We have removed the superscripts and subscripts from the chance variables and
put them on f and on F below, which are the symbols for the density and c.d.f.,
respectively.)

Using the symmetry of the above density about t = n/2, we can replace t
by n − t in (2.3), obtaining

(2.4)    $f_{n,n}^{(n)}(t) = \frac{n}{(n-1)!}\left\{\binom{n-1}{0}\frac{(n-t)^{n-1}}{n} - \binom{n-1}{1}\frac{(n-1-t)^{n-1}}{n-1} + \cdots\right\} = A_{n-1,n}^{(n,n)}(n - t)$,

where $0 < t \le n$. The form (2.4) is more comparable with the results derived
here. It is shown below that the density and c.d.f. of $T_{r,m}^{(n)}$ are given by the
comparable results

(2.5)    $f_{r,m}^{(n)}(t) = A_{r-1,m}^{(n,n)}(m - t)$

and

(2.6)    $F_{r,m}^{(n)}(t) = 1 - \frac{1}{n+1}\,A_{r-1,m}^{(n,n+1)}(m - t)$,

from which we get as special cases the densities and c.d.f.'s of (i) $T_{r,r}^{(n)}$, (ii) $T_{r,n}^{(n)}$
and (iii) $T_{n,n}^{(n)}$.
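
The density and c.d.f. in (2.5) and (2.6) can be evaluated numerically with a few lines of code. The following is a minimal sketch, not part of the original paper; the function names `density` and `cdf` are illustrative, and the alternating sums are written out term by term in the form obtained in Sec. 3, keeping only terms whose argument m − t − j is positive.

```python
from math import comb, factorial


def density(t, r, m, n):
    """Density of T = t_1 + ... + t_{r-1} + (m - r + 1) t_r at the point t."""
    if not 0 < t < m:
        return 0.0
    total = 0.0
    for j in range(r):                      # j = 0, 1, ..., r - 1
        arg = m - t - j
        if arg <= 0:                        # keep only terms with positive argument
            break
        total += (-1) ** j * comb(r - 1, j) * arg ** (n - 1) / (m - j) ** (n - r + 1)
    return factorial(n) / (factorial(r - 1) * factorial(n - 1)) * total


def cdf(t, r, m, n):
    """P(T <= t) for the same statistic."""
    if t <= 0:
        return 0.0
    if t >= m:
        return 1.0
    total = 0.0
    for j in range(r):
        arg = m - t - j
        if arg <= 0:
            break
        total += (-1) ** j * comb(r - 1, j) * arg ** n / (m - j) ** (n - r + 1)
    return 1.0 - total / factorial(r - 1)


if __name__ == "__main__":
    # n = 10, r = 5: sum of the five earliest failure times (m = r)
    print(round(cdf(1.5, 5, 5, 10), 4))
    # n = 10, r = 5: total observed life up to the fifth failure (m = n)
    print(round(cdf(4.0, 5, 10, 10), 4))
```

For n = 10, r = 5 these two calls should agree, to four decimals, with the exact probabilities quoted for t = 1.5 (m = 5) and t = 4.0 (m = 10) in Table I.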

Barton and David [1] have derived another equivalent formula for the density
$f_{r,r}^{(n)}(t)$, i.e., in the special case (i). Their result, with two typographical
corrections taken into account, is


, ° . n—l 
(Py =2> (-yv (7) inte lint Fs | 
Tr. i=l a z 


The total life statistic arises as an optimum statistic under exponential
distribution assumptions in [3]. In the present paper we give the distribution of
this statistic when the exponential distribution assumption is replaced by the
uniform distribution. Hence these results can be used to study the robustness of
the tests based on the total life statistic. The results on asymptotic normality
are also of interest in this connection since under the exponential assumption
the distribution of $2T_{r,n}^{(n)}/\theta$ is that of $\chi^2_{2r}$, which, for large r, also is close to that of a
normal distribution. It is felt that the model of a uniform distribution from 0 to
$\theta$, $\theta > 0$ and unknown, and the results of this paper may prove to be useful in
some life-testing problems.


3. Derivation of results. Let $u = t_1 + t_2 + \cdots + t_{r-1}$, $v = t_r$, $w = u/v$
and $y = T_{r,m}^{(n)} = u + (m - r + 1)v$, where $t_i = t_i^{(n)}$ is the ith smallest of n
independent chance variables uniformly distributed from zero to one. The conditional
distribution of w given v is exactly that of a sum of r − 1 independent
uniform chance variables and is given by (2.3) with n replaced by r − 1. Hence
the joint density of v and w is given by

(3.1)    $g(v, w) = \frac{n!}{(r-1)!\,(n-r)!}\,v^{r-1}(1 - v)^{n-r}\cdot\frac{1}{(r-2)!}\left\{\binom{r-1}{0}w^{r-2} - \binom{r-1}{1}(w-1)^{r-2} + \cdots\right\}$,

where $0 \le w \le r - 1$ and $0 < v \le 1$, and the joint density of u and v is given
by

(3.2)    $h(u, v) = \frac{n!}{(r-2)!\,(r-1)!\,(n-r)!}\left\{\binom{r-1}{0}u^{r-2} - \binom{r-1}{1}(u - v)^{r-2} + \cdots\right\}(1 - v)^{n-r}$,

where $0 \le u \le (r - 1)v$ and $v \le 1$.
If we now derive the density of y, then the full range of y from 0 to m is broken
into r parts. For $0 \le y \le m - r + 1$, the density of y becomes

(3.3)    $f_{r,m}^{(n)}(y) = \frac{n!}{(r-2)!\,(r-1)!\,(n-r)!}\Big\{\binom{r-1}{0}\int_{y/m}^{y/(m-r+1)}[y - (m - r + 1)v]^{r-2}(1 - v)^{n-r}\,dv$
         $\qquad - \binom{r-1}{1}\int_{y/m}^{y/(m-r+2)}[y - (m - r + 2)v]^{r-2}(1 - v)^{n-r}\,dv$
         $\qquad + \cdots + (-1)^{r-2}\binom{r-1}{r-2}\int_{y/m}^{y/(m-1)}[y - (m - 1)v]^{r-2}(1 - v)^{n-r}\,dv\Big\}$.


Using the finite difference operators E, Δ (with E = 1 + Δ),

(3.4)    $f_{r,m}^{(n)}(y) = \frac{n!}{(r-2)!\,(r-1)!\,(n-r)!}\sum_{\alpha=0}^{r-1}\binom{r-1}{\alpha}(-E)^{\alpha}\left\{\int_{y/m}^{y/(m-r+1+x)}[y - (m - r + 1 + x)v]^{r-2}(1 - v)^{n-r}\,dv\right\}$,

where E operates on x and it is understood that x is then to be set equal to 0.
Using the relation between E and Δ,

(3.5)    $f_{r,m}^{(n)}(y) = \frac{n!}{(r-2)!\,(r-1)!\,(n-r)!}\,(-\Delta)^{r-1}\left\{\int_{y/m}^{y/(m-r+1+x)}[y - (m - r + 1 + x)v]^{r-2}(1 - v)^{n-r}\,dv\right\}$.
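If we now integrate by parts, the first term vanishes at the upper limit and also
at the lower limit because of the operator $\Delta^{r-1}$. After r − 1 such integrations we
obtain

(3.6)    $f_{r,m}^{(n)}(y) = \frac{n!}{(r-1)!\,(n-1)!}\,\Delta^{r-1}\left\{\frac{(m - r + 1 + x - y)^{n-1}}{(m - r + 1 + x)^{n-r+1}}\right\}_{x=0}$.

Using Δ = E − 1 we obtain

(3.7)    $f_{r,m}^{(n)}(y) = \frac{n!}{(r-1)!\,(n-1)!}\left\{\binom{r-1}{0}\frac{(m-y)^{n-1}}{m^{n-r+1}} - \binom{r-1}{1}\frac{(m-1-y)^{n-1}}{(m-1)^{n-r+1}} + \cdots + (-1)^{r-1}\binom{r-1}{r-1}\frac{(m-r+1-y)^{n-1}}{(m-r+1)^{n-r+1}}\right\}$
         $\qquad = A_{r-1,m}^{(n,n)}(m - y)$,

where $0 \le y \le m - r + 1$ and $A_{r-1,m}^{(n,n)}$ is defined in (2.2).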

We shall now show that the expression (3.7) gives the result for all y ($0 \le y \le m$).
For $m - r + i \le y \le m - r + i + 1$ (i = 1, 2, ..., r − 1) the only
difference is that the first i upper limits of integration in (3.3) are all changed to
unity. For the jth integral (j = 1, 2, ..., i) we have to add to the complete set
of r terms in (3.7) the quantity

(3.8)    $(-1)^{j-1}\binom{r-1}{j-1}\frac{n!}{(r-2)!\,(r-1)!\,(n-r)!}\int_{y/(m-r+j)}^{1}[y - (m - r + j)v]^{r-2}(1 - v)^{n-r}\,dv$
         $\qquad = (-1)^{r+j-1}\binom{r-1}{j-1}\frac{n!}{(r-1)!\,(n-1)!}\cdot\frac{(m - r + j - y)^{n-1}}{(m - r + j)^{n-r+1}}$.

For each j ($1 \le j \le i$) the quantity on the right in (3.8) cancels the jth term
from the end of the complete expression with r terms in (3.7). Hence for

$m - r + i \le y \le m - r + i + 1$

the density is given by the first r − i terms of (3.7), which are precisely those
terms with positive arguments. This proves that the expression $A_{r-1,m}^{(n,n)}(m - y)$
of (3.7) gives the result for all y ($0 \le y \le m$).

The c.d.f. $F_{r,m}^{(n)}(y)$ of y is easily obtained by integrating (3.7) between the
limits 0 and y and is given by

(3.9)    $F_{r,m}^{(n)}(y) = 1 - \frac{1}{n+1}\,A_{r-1,m}^{(n,n+1)}(m - y)$.


4. Moments of $y = T_{r,m}^{(n)}$. Using the expression for the density it can be
shown that the moment generating function $M_\theta(y)$ of $y = T_{r,m}^{(n)}$ is given by

(4.1)    $M_\theta(y) = \frac{n!}{(r-1)!\,(n-1)!}\sum_{x=0}^{r-1}(-1)^{x}\binom{r-1}{x}\left\{\frac{1}{(m-x)^{n-r+1}}\int_0^{m-x}e^{\theta y}(m - x - y)^{n-1}\,dy\right\}$
peal) 


RO pntee gym, 
(r— meBGr al : m) |» 


(—6)’ 
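Thus we have for the jth moment

(4.3)    $E(y^j) = \frac{n!\,j!}{(r-1)!\,(n+j)!}\sum_{\alpha=0}^{r-1}(-1)^{\alpha}\binom{r-1}{\alpha}(m - \alpha)^{r-1+j}$

(4.4)    $\qquad\;\; = \frac{n!\,j!}{(r-1)!\,(n+j)!}\sum_{\beta=0}^{j}(-1)^{\beta}\,m^{j-\beta}\binom{r-1+j}{j-\beta}\,a_\beta$.

It can be shown that for $j \ge 0$ and $r \ge 2$

(4.5)    $a_j = \sum_{\alpha=1}^{r-1}(-1)^{r-1-\alpha}\binom{r-1}{\alpha}\alpha^{r-1+j} = \Delta^{r-1}0^{r-1+j}$.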
The results for various values of j in (4.5) are known and are given, for example,
in [4], p. 127. Using these we have from (4.4)

(4.6)
    $E(y) = \frac{r(2m - r + 1)}{2(n+1)}$,

    $E(y^2) = \frac{r(r+1)}{12(n+1)(n+2)}\left[12m^2 - 12m(r-1) + (r-1)(3r-2)\right]$,

    $\sigma^2(y) = \frac{r(n-r+1)(2m-r+1)^2}{4(n+1)^2(n+2)} + \frac{r(r+1)(r-1)}{12(n+1)(n+2)}$,

    $E(y^3) = \frac{r(r+1)(r+2)}{8(n+1)(n+2)(n+3)}\left[8m^3 - 12m^2(r-1) + 2m(r-1)(3r-2) - r(r-1)^2\right]$,

    $E(y^4) = \frac{r(r+1)(r+2)(r+3)}{2(n+1)(n+2)(n+3)(n+4)}\left[2m^4 - 4m^3(r-1) + m^2(r-1)(3r-2) - m(r-1)^2 r + \tfrac{1}{120}(r-1)(15r^3 - 15r^2 - 10r + 8)\right]$.

Since the computation of cumulants leads to no simplification, they have not 
been given here; they can be obtained by the usual formulae. It should be men- 
tioned that the above expressions for the moments can also be obtained directly 
by using the moments of the order statistics. 
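
The closed-form moments can be checked in exact rational arithmetic against the alternating sum that results from integrating $y^j$ term by term against the density of Sec. 3. The sketch below is an illustration only and is not taken from the paper; the function names are hypothetical, and m is assumed to be an integer so that `Fraction` arithmetic stays exact.

```python
from fractions import Fraction
from math import comb, factorial


def moment_sum(j, r, m, n):
    """E(y^j) as n! j! / ((r-1)! (n+j)!) * sum_x (-1)^x C(r-1, x) (m - x)^(r-1+j)."""
    s = sum((-1) ** x * comb(r - 1, x) * Fraction(m - x) ** (r - 1 + j)
            for x in range(r))
    return Fraction(factorial(n) * factorial(j),
                    factorial(r - 1) * factorial(n + j)) * s


def mean_closed(r, m, n):
    return Fraction(r * (2 * m - r + 1), 2 * (n + 1))


def second_closed(r, m, n):
    bracket = 12 * m * m - 12 * m * (r - 1) + (r - 1) * (3 * r - 2)
    return Fraction(r * (r + 1), 12 * (n + 1) * (n + 2)) * bracket


def variance_closed(r, m, n):
    first = Fraction(r * (n - r + 1) * (2 * m - r + 1) ** 2,
                     4 * (n + 1) ** 2 * (n + 2))
    second = Fraction(r * (r + 1) * (r - 1), 12 * (n + 1) * (n + 2))
    return first + second


def third_closed(r, m, n):
    bracket = (8 * m ** 3 - 12 * m * m * (r - 1)
               + 2 * m * (r - 1) * (3 * r - 2) - r * (r - 1) ** 2)
    return Fraction(r * (r + 1) * (r + 2),
                    8 * (n + 1) * (n + 2) * (n + 3)) * bracket


if __name__ == "__main__":
    r, m, n = 5, 5, 10
    print(mean_closed(r, m, n), variance_closed(r, m, n))
```

Agreement for several (r, m, n) is a consistency check on the transcription of the formulas, not a proof.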

5. Asymptotic normality of $y = T_{r,m}^{(n)}$. We shall randomize the order of the
chance variables $t_1, t_2, \cdots, t_{r-1}$ and thus define new unordered equi-correlated
and identically distributed chance variables $u_1, u_2, \cdots, u_{r-1}$. Furthermore, if
we consider the conditional joint distribution of the $u_i$ given v ($= t_r$), then we
have independent chance variables which are uniformly distributed from 0 to v.
Let

(5.1)    $y^* = \frac{y - E(y)}{\sigma(y)}$  and  $y_v^* = \frac{y - E(y \mid v)}{\sigma(y \mid v)}$,

where $y = T_{r,m}^{(n)} = u_1 + u_2 + \cdots + u_{r-1} + (m - r + 1)v$.
The characteristic function of $y^*$ is given by


’ 


1 


gee ze) if a) 
gy () = | | 4 exp {it (2 FW) \ - g(v) dv 


j i=l 


al t , he o( - 
- | l + [exp “wily Fis) 


o(y|v) 


+ it o(v) [zt v) = Ao} | 1 ee ov) de, 
a(y) a(v) i inl v 


where

(5.3)    $E(y \mid v) = (r - 1)\frac{v}{2} + (m - r + 1)v = (2m - r + 1)\frac{v}{2}$,

(5.4)    $\sigma^2(y \mid v) = \frac{(r-1)v^2}{12}$,

and

(5.5)    $g(v) = \frac{n!}{(r-1)!\,(n-r)!}\,v^{r-1}(1 - v)^{n-r}$.


Letting 


o(y |v) 
oy) 


and 2; = (7 = 1,2,---,r— 1), we obtain 


al 1 \ Ly r—1 — 
ee Ss = feat) 
(5.7) lf oy ae az | e * te) 40(v) dv 
q 0 


(5.6) 


z v—B (rv) 
(5.8) = <= | e © #4) “g(v) dv. 


Since for r = λn and n → ∞ we have

(5.9)    $E(v) = \frac{r}{n+1} \to \lambda$  and  $\sigma^2(v) = \frac{r(n-r+1)}{(n+1)^2(n+2)} = O\!\left(\frac{1}{n}\right)$,

then we shall write $v = \lambda + O(1/\sqrt{n})$ in the expression (5.6) for t′ which is
needed for the first part of the integrand in (5.8). For m = γn we obtain the
two asymptotic relations

(5.10)    $\frac{\sigma(y \mid v)}{\sigma(y)} = \frac{\lambda}{\sqrt{3(1-\lambda)(2\gamma-\lambda)^2 + \lambda^2}} + O\!\left(\frac{1}{\sqrt{n}}\right)$

and

(5.11)    $\frac{\sigma(v)}{\sigma(y)}\left(m - \frac{r-1}{2}\right) \to \frac{\sqrt{3(1-\lambda)}\,(2\gamma-\lambda)}{\sqrt{3(1-\lambda)(2\gamma-\lambda)^2 + \lambda^2}}$,

so that if we denote the first term in the right hand members of (5.10) and (5.11)
by a and b respectively, then $a^2 + b^2 = 1$. Taking the limit in (5.8) as n → ∞
with r = λn, m = γn and using the Lebesgue theorem, we can bring the limit
operator under the integral sign. Then, using (5.10), we obtain


— bie Lo ae 1 ale see 
(5.12) ef) = : lim} 1 —- =——__ + 0 {|_—,, lim e* l-e@) Jq(v) dv 


2(r — 1) n*! 


i el o[?—F Ce) 
(5.13) =~¢ 7 | lim e" FFT lg(e) dv. 
0 


Using the same Lebesgue theorem the limit operator can be taken outside the
integral sign. Then, using a result on the asymptotic normality of quantiles
given on page 369 in Cramér [2], we obtain

(5.14)    $\lim_{n\to\infty}\varphi_{y^*}(t) = e^{-a^2t^2/2}\,e^{-b^2t^2/2} = e^{-t^2/2}$,

since $a^2 + b^2 = 1$. This proves the asymptotic normality of y for r = λn,
m = γn (γ and λ fixed with $0 < \lambda \le 1$ and $\lambda \le \gamma < \infty$) and n → ∞.

It should be noted that the above proof holds no matter how fast m tends
to infinity. If m/n → ∞ then a = 0 and b = 1 and (5.14) still holds.
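
The limiting constants a and b of (5.10) and (5.11) are simple functions of λ and γ, and the identity a² + b² = 1 can be checked numerically. The sketch below is illustrative only: `limit_weights` and `finite_ratio` are hypothetical names, and `finite_ratio` evaluates σ(y|v)/σ(y) at v = E(v) using the conditional variance (r − 1)v²/12 and the variance formula of Sec. 4.

```python
from math import sqrt


def limit_weights(lam, gamma):
    """Limits a and b of (5.10) and (5.11) for r = lam * n, m = gamma * n."""
    denom = sqrt(3.0 * (1.0 - lam) * (2.0 * gamma - lam) ** 2 + lam ** 2)
    a = lam / denom
    b = sqrt(3.0 * (1.0 - lam)) * (2.0 * gamma - lam) / denom
    return a, b


def finite_ratio(r, m, n):
    """sigma(y | v) / sigma(y) with v replaced by its mean r / (n + 1)."""
    v = r / (n + 1)
    cond_var = (r - 1) * v * v / 12.0
    var_y = (r * (n - r + 1) * (2 * m - r + 1) ** 2 / (4.0 * (n + 1) ** 2 * (n + 2))
             + r * (r + 1) * (r - 1) / (12.0 * (n + 1) * (n + 2)))
    return sqrt(cond_var / var_y)


if __name__ == "__main__":
    a, b = limit_weights(0.5, 1.0)
    print(a, b, a * a + b * b)
    print(finite_ratio(500_000, 1_000_000, 1_000_000))
```

With λ = 0.5 and γ = 1 the finite-n ratio at n = 10⁶ should already be very close to a.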


6. Illustration of rapidity of approach to normality. To illustrate the rapidity
of approach to normality of the statistics, we shall use the Edgeworth series
expansion

(6.1)    $F_{r,m}^{(n)}(t) = \{\Phi(x)\} - \left\{\frac{1}{3!}\left(\frac{\mu_3}{\sigma^3}\right)\Phi^{(3)}(x)\right\} + \left\{\frac{1}{4!}\left(\frac{\mu_4}{\sigma^4} - 3\right)\Phi^{(4)}(x) + \frac{10}{6!}\left(\frac{\mu_3}{\sigma^3}\right)^2\Phi^{(6)}(x)\right\} + \cdots$,

where $\Phi(x)$ is the standard normal c.d.f., $\Phi^{(r)}(x)$ is its rth derivative and x
denotes the standardized variate corresponding to t. We wish to compute one, two,
and three terms of (6.1) as indicated by the braces for the two special cases of
$T_{r,m}^{(n)}$; viz., (i) m = r and (ii) m = n. These have been computed for n = 10,
r = 5 and the results are compared in Table I below with the exact values
computed from (2.6).
TABLE I

Comparison of exact probability $P(T_{r,m}^{(n)} \le t)$ and Edgeworth approximations

                                      Approximations
  Case         t       x        1 term   2 terms   3 terms   Exact probability

  r = 5       1.5   0.26656     .6051     .6340     .6318        .6327
  m = 5       2.0   1.24393     .8932     .8851     .8849        .8839
  n = 10      2.5   2.22131     .9868     .9761     .9780        .9769
              3.0   3.19868     .9993     .9975     .9969        .9973

  r = 5       4.0   0.30754     .6208     .6312     .6261        .6259
  m = 10      5.0   1.15329     .8756     .8736     .8681        .8671
  n = 10      6.0   1.99902     .9772     .9723     .9741        .9739
              7.0   2.84475     .9978     .9963     .9976        .9979
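
The x column and the one-term approximation of Table I depend only on the mean and variance of the statistic, so they are easy to recompute. The following sketch is an illustration under that reading, not the authors' computation; the function names are hypothetical and the mean and variance are taken from the moment formulas of Sec. 4.

```python
from math import erf, sqrt


def mean_and_sd(r, m, n):
    """Mean and standard deviation of T from the closed-form moments."""
    mean = r * (2 * m - r + 1) / (2.0 * (n + 1))
    var = (r * (n - r + 1) * (2 * m - r + 1) ** 2 / (4.0 * (n + 1) ** 2 * (n + 2))
           + r * (r + 1) * (r - 1) / (12.0 * (n + 1) * (n + 2)))
    return mean, sqrt(var)


def standardize(t, r, m, n):
    """Standardized variate x corresponding to t."""
    mean, sd = mean_and_sd(r, m, n)
    return (t - mean) / sd


def normal_cdf(x):
    """Standard normal c.d.f., i.e. the first term of the Edgeworth series."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


if __name__ == "__main__":
    for t, m in [(1.5, 5), (2.0, 5), (4.0, 10), (5.0, 10)]:
        x = standardize(t, 5, m, 10)
        print(t, m, round(x, 5), round(normal_cdf(x), 4))
```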


7. Acknowledgment. The authors wish to thank Prof. J. W. Tukey for his 
helpful comments and suggestions. 
DETERMINING SAMPLE SIZE FOR A SPECIFIED WIDTH
CONFIDENCE INTERVAL


By FRANKLIN A. GRAYBILL
Oklahoma State University


1. Introduction. If an experimenter decides to use a confidence interval to
locate a parameter, he is concerned with at least two things: (1) Does the
interval contain the parameter? (2) How wide is the interval? In general the
answer to these questions cannot be given with absolute certainty, but must
be given with a probability statement. If we let α be the probability that the
interval contains the parameter, and let β² be the probability that the width is
less than d units, then the general procedure is to fix α in advance and compute
β². The value of β² is in general a function of the positive integer n, the sample
size by which the confidence interval is computed. (β² is also a function of α.)
In most confidence intervals, β² increases as n increases. For any particular
situation β² may be too low to be useful, hence an experimenter may wish to
increase β² by taking more observations (increasing n). The problem the
experimenter then faces is the determination of n such that (A) the probability
will be equal to α that the confidence interval contains the parameter, and (B)
the probability will be equal to β² that the width of the confidence interval will
be less than d units (where α, β², and d are specified).

To solve this problem will generally require two things: (1) The form of the 
frequency function from which the sample of size n is to be selected; (2) Some 
previous information on the unknown parameters in the frequency function. 

This suggests that the sample be taken in two steps; the first sample will be 
used to determine the number of observations to be taken in the second sample 
so that (A) and (B) will be satisfied. 

For a confidence interval on the mean of a normal population with unknown
variance this problem has been solved by Stein [1] for β = 1.

The purpose of this paper is to determine n to satisfy (A) and (B) for
distributions other than the normal.


2. Theory. Suppose X is the width of a confidence interval on a parameter
μ with confidence coefficient α. Suppose further that it is desired that the
probability be β² that X be less than d. The problem is to determine n, the number
of observations, on which to base X. Since n depends on the random variables
used in step one, n is a random variable.


We will prove the following (we will use the notation P(A) for the prob- 
ability that the event A occurs): 

THEOREM. Let the chance variable X be the width of a confidence interval on a
parameter μ based on a sample of size n. Suppose that X depends on n and on an
unknown parameter θ (θ may be the parameter μ). Suppose also that there exists a
function of X, θ, and n, say g(X; θ, n), such that if Y = g(X; θ, n), then the
distribution of Y does not depend on any unknown parameters except n. Let f(n) be
a function of n such that

(1)    P[Y < f(n)] = β for any 0 < β < 1.

Let the solution of the equation g(x; θ, n) = f(n) for x be x = h(θ, n), and
suppose the following are true for x > 0:

(2)    (a) g(x; θ, n) is monotonic increasing in x for every n and θ.
       (b) h(θ, n) is monotonic increasing in θ for every n.
       (c) h(θ, n) is monotonic decreasing in n for every θ.
       (d) z is a random variable which is available from step one of the procedure
           such that P{t(z) > θ} = β for 0 < β < 1, where t(z) is a function of z
           which does not depend on any unknown parameters or on n.

Let d and β be specified in advance. Then if n is such that the equation

(3)    h[t(z), n] ≤ d

is satisfied (t(z) is known) then the following inequality is true:

(4)    P(X ≤ d) ≥ β².
PROOF. Substituting into Eq. (1) we get

(5)    P[g(X; θ, n) < f(n)] = β.

Solving for X and using 2(a) gives us

(6)    P[X < h(θ, n)] = β.

For any θ₁ ≥ θ we can use 2(b) and obtain

(7)    P[X < h(θ₁, n) | θ₁ ≥ θ] ≥ P[X < h(θ, n)] = β.

By considering the joint distribution of X and t(z) we can write

(8)    P(X < d) ≥ P[X < d, t(z) > θ] = P[X < d | t(z) > θ]·P[t(z) > θ].

If n is any integer satisfying

(9)    h[t(z), n] ≤ d,

we can use (7), 2(c), and 2(d) in Eq. (8) and obtain

(10)    P(X ≤ d) ≥ β².

If the function in 2(b) is monotonic decreasing, then the theorem is also true
but the inequality in 2(d) must be reversed. The theorem is also true if the
function in 2(a) is monotonic decreasing. The conditions (2) may appear quite
stringent; however, many of the functions in common use in statistics satisfy
these conditions.
3. Illustrations

Example 1. Suppose we want an α confidence interval on the variance σ² of
a normal population to be less than d units in length with probability of β².
We will define

$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(v_i - \bar v)^2$,

where $v_i$ is distributed normally with mean μ and variance σ². An α confidence
interval on σ² is given by

$P\left[\frac{(n-1)s_n^2}{\chi_1^2(n)} \le \sigma^2 \le \frac{(n-1)s_n^2}{\chi_2^2(n)}\right] = \alpha$,

where $\chi_1^2(n)$ and $\chi_2^2(n)$ are such that

$\int_0^{\chi_2^2(n)} W(\chi^2; n)\,d\chi^2 = \int_{\chi_1^2(n)}^{\infty} W(\chi^2; n)\,d\chi^2 = \frac{1-\alpha}{2}$,

where $W(\chi^2; n)$ is a Chi-square frequency function with n − 1 degrees of freedom.
The width of the interval is

$X = (n-1)s_n^2\left[\frac{1}{\chi_2^2(n)} - \frac{1}{\chi_1^2(n)}\right]$.

If we let

$C_n = \frac{1}{\chi_2^2(n)} - \frac{1}{\chi_1^2(n)}$,

we have $g(X; \theta, n) = X/(\sigma^2 C_n) = Y$ (here θ = σ²), and we see that Y is distributed
as $W(\chi^2; n)$ and is independent of any unknown parameters except n.
Also f(n) is given by $\int_0^{f(n)} W(\chi^2; n)\,d\chi^2 = \beta$, and $h(\theta, n) = \theta\,C_n f(n)$.
Suppose in step one of our procedure we observe $u_1, u_2, \cdots, u_m$, which is a
random sample of size m from a normal population with variance σ². If we let

$z = \sum_{i=1}^{m}(u_i - \bar u)^2$,

then since $z/\sigma^2$ is distributed as $W(\chi^2; m)$, it is clear that $P\{(z/\sigma^2) > f_m\} = \beta$,
where $f_m$ is such that

$\int_{f_m}^{\infty} W(\chi^2; m)\,d\chi^2 = \beta$.

Hence $t(z) = z/f_m$, and since all the conditions in (2) are satisfied, the sample
size for the desired length of the confidence interval is the smallest integral
value of n satisfying

$\frac{f(n)\cdot C_n\cdot z}{f_m} \le d$.
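
The search for n in Example 1 can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure as published: the equal-tail choice of the two chi-square points, the function names, and the simple series and bisection routines (used so that the sketch needs only the standard library) are all assumptions.

```python
from math import exp, lgamma, log


def chi2_cdf(x, k):
    """Chi-square c.d.f. with k degrees of freedom (series for the incomplete gamma)."""
    if x <= 0:
        return 0.0
    a, h = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    i = 1
    while term > 1e-16 * total and i < 100000:
        term *= h / (a + i)
        total += term
        i += 1
    return min(1.0, total * exp(-h + a * log(h) - lgamma(a)))


def chi2_quantile(p, k):
    """Point x with chi2_cdf(x, k) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def width_bound(theta, n, alpha, beta):
    """h(theta, n) = theta * C_n * f(n) for the variance interval (equal tails assumed)."""
    lower = chi2_quantile((1.0 - alpha) / 2.0, n - 1)
    upper = chi2_quantile((1.0 + alpha) / 2.0, n - 1)
    c_n = 1.0 / lower - 1.0 / upper
    f_n = chi2_quantile(beta, n - 1)
    return theta * c_n * f_n


def sample_size(z, m, d, alpha, beta, n_max=10000):
    """Smallest n >= 2 with h[t(z), n] <= d, where t(z) = z / f_m."""
    f_m = chi2_quantile(1.0 - beta, m - 1)   # P(chi-square with m-1 d.f. > f_m) = beta
    t_z = z / f_m
    for n in range(2, n_max + 1):
        if width_bound(t_z, n, alpha, beta) <= d:
            return n
    raise ValueError("no n up to n_max satisfies the width condition")


if __name__ == "__main__":
    # hypothetical first-step data: m = 11 observations with corrected sum of squares 20
    print(sample_size(z=20.0, m=11, d=3.0, alpha=0.95, beta=0.9))
```

The loop returns the first n that meets the condition, which is the smallest such n; condition 2(c) of the theorem is what guarantees that larger n also satisfy it.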


Example 2. Next, suppose it is desired to determine the sample size such
that an α confidence interval on the mean of a normal population will have
width less than d with probability β². Let $v_1, v_2, \cdots, v_n$ be a random sample
of size n (to be determined) from a normal distribution with mean μ and
variance σ² = θ. If we let

$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(v_i - \bar v)^2$,

then an α confidence interval on μ is

$\bar v - \frac{t_0 s_n}{\sqrt{n}} \le \mu \le \bar v + \frac{t_0 s_n}{\sqrt{n}}$,

where $t_0$ is such that

$\int_{t_0}^{\infty} U(t, n)\,dt = \frac{1-\alpha}{2}$,

where U(t, n) is "Student's" distribution with (n − 1) degrees of freedom.
The length of the interval is

$X = \frac{2t_0 s_n}{\sqrt{n}}$.

If we let

$Y = g(X; \theta, n) = \frac{(n-1)s_n^2}{\sigma^2} = \frac{n(n-1)X^2}{4t_0^2\sigma^2}$,

then Y is distributed as a Chi-square variate with (n − 1) degrees of freedom,
and is independent of any unknown parameters except n.

If $W(\chi^2; n)$ is a Chi-square frequency function with (n − 1) degrees of freedom,
then f(n) is given by

$\int_0^{f(n)} W(\chi^2; n)\,d\chi^2 = \beta$,

and

$X = h(\theta, n) = \frac{2t_0\,\sigma\sqrt{f(n)}}{\sqrt{n(n-1)}}$.

Suppose $u_1, u_2, \cdots, u_m$ is a random sample of size m from a normal population
with variance σ² which is available from step one of our procedure.
If we let

$z = \sum_{i=1}^{m}(u_i - \bar u)^2$,

then we have $P\{(z/\sigma^2) > f_m\} = \beta$, where $f_m$ is such that $\int_{f_m}^{\infty} W(\chi^2; m)\,d\chi^2 = \beta$.

Hence $t(z) = z/f_m$, that is, σ is replaced by $(z/f_m)^{1/2}$ in h, and since all the conditions
in (2) are satisfied, the sample size for step two is the smallest integral value of n satisfying

$\frac{2t_0\sqrt{z}\,\sqrt{f(n)}}{\sqrt{f_m}\,\sqrt{n(n-1)}} \le d$.


It is interesting to compare the method in this paper with the method
presented by Stein [1] for setting a confidence interval on the mean of a normal
population with unknown variance.

The procedure presented by Stein is to select a two step sample. Suppose
the sample in the first step is $u_1, u_2, \cdots, u_m$ and is taken from a normal
population with mean μ and variance σ². An α confidence interval on μ is

$\bar u - \frac{t_m s}{\sqrt{m}} \le \mu \le \bar u + \frac{t_m s}{\sqrt{m}}$,

where $s^2 = \frac{1}{m-1}\sum_{i=1}^{m}(u_i - \bar u)^2$ and $t_m$ is the appropriate value from "Student's"
distribution with m − 1 degrees of freedom. The width of the interval
is $2t_m s/m^{1/2}$ and if this is less than the desired width d, no second step is required.
If $2t_m s/m^{1/2} > d$, then n additional observations $w_1, w_2, \cdots, w_n$ are taken, where
$n = (4t_m^2 s^2/d^2) - m$, and the α confidence interval is

$\bar x - \frac{t_m s}{\sqrt{m+n}} \le \mu \le \bar x + \frac{t_m s}{\sqrt{m+n}}$,

where

$\bar x = \frac{n\bar w + m\bar u}{m+n}$.

The width of the interval is $2t_m s/(m + n)^{1/2}$, and this is less than d.
It is to be noted that observations in the second sample are used only to
compute the mean, $\bar x$.

Let us assume that the observations in the first step are taken from a normal
population with mean μ₁ and variance σ₁², and in the second step the mean is
μ₂ and the variance σ₂². Stein's method is valid if μ₁ = μ₂ and σ₁² = σ₂². However,
if μ₁ ≠ μ₂, but σ₁² = σ₂², the method can still be used to set a specified
confidence interval on μ₂; the only alteration is that the second step requires
a sample of size n + m and $\bar x$ is the mean of this sample. That is to say, the
sample mean from step one is not used in computing the interval. In this case
if the inequality

$\frac{2t_m s}{\sqrt{m+n}} \le d$

in Stein's procedure is compared with the inequality

$\frac{2t_0\sqrt{z\cdot f(n)}}{\sqrt{f_m\cdot n(n-1)}} < d$

for the method presented in this paper, it is evident that Stein's procedure is
to be preferred.

Next suppose that μ₁ ≠ μ₂ and σ₁² ≠ σ₂². Then Stein's procedure gives a
confidence interval on μ₂ with known probability (equal to 1) of a specified width
but the confidence coefficient is not known. The method presented in this paper
will give a confidence interval on μ₂ with unknown probability of a specified
width, but with known confidence coefficient.

Therefore, there may be cases when an experimenter would prefer the method
in this paper over the one given by Stein for the mean of a normal distribution.
NOTES


AN EXTENSION OF THE OPTIMUM PROPERTY OF THE
SEQUENTIAL PROBABILITY RATIO TEST


By M. A. GIRSHICK
Stanford University

Let f(x, θ) be a family of densities or discrete probability functions depending
on the parameter θ. Let H₀ be the hypothesis θ = θ₀ and H₁ the hypothesis
that θ = θ₁. A sequential probability ratio test of H₀ versus H₁ is defined by
two numbers A and B. After drawing the mth observation, sampling is
continued if

(1)    $B < \prod_{i=1}^{m}\frac{f(x_i, \theta_1)}{f(x_i, \theta_0)} < A$,

where $x_1, \cdots, x_m$ are the first m observations. If the probability ratio is at
least equal to A, H₁ is accepted, and if it is not greater than B, H₀ is accepted.
For any sequential procedure T, let the operating characteristic be

(2)    L(θ, T) = Pr {Accepting H₀ | θ, T},

and let $\mathcal{E}_\theta(n \mid T)$ be the expected number of observations required by T when
sampling from f(x, θ). The so-called optimum property (see [5], for instance)
of a sequential probability ratio test, say T*, is that if L(θ₀, T) ≥ L(θ₀, T*)
and L(θ₁, T) ≤ L(θ₁, T*), then

$\mathcal{E}_{\theta_0}(n \mid T) \ge \mathcal{E}_{\theta_0}(n \mid T^*)$,    $\mathcal{E}_{\theta_1}(n \mid T) \ge \mathcal{E}_{\theta_1}(n \mid T^*)$.


In many cases this optimum property can be extended to all values of the
parameter. Suppose θ₀ < θ₁, and let θ̄ be a number to be defined later such that
θ₀ < θ̄ < θ₁. Under conditions stated below, we give the extended optimum
property. If

(3)    L(θ, T) ≥ L(θ, T*),    θ < θ̄,
       L(θ, T) ≤ L(θ, T*),    θ > θ̄,

for all θ ≠ θ̄, then

(4)    $\mathcal{E}_\theta(n \mid T) \ge \mathcal{E}_\theta(n \mid T^*)$

for all θ. Inequalities (3) indicate the premise that T is everywhere as good as
T* in the sense that the operating characteristic for T is at least as high as for
T* for θ on one side of θ̄ and is at least as low as for T* on the other side of θ̄.
Then T* is everywhere as good as T in terms of expected number of observations.

Received May 9, 1956; revised November 15, 1957.
of his colleagues, but was unpublished at the time of his death. Since I think the result
is of sufficient interest to be in the literature, I have taken the liberty of writing this note

To demonstrate the property we assume that for θ ≠ θ̄ there is a unique
nonzero root, say h(θ), of

(5)    $\mathcal{E}_\theta\left[\left(\frac{f(x, \theta_1)}{f(x, \theta_0)}\right)^{h}\right] = 1$

and that h(θ) > 0 for θ < θ̄ and h(θ) < 0 for θ > θ̄. (See [4] for discussion of
the assumption and of the technique used here.) This implies that given θ₀ and
θ₁, the value of θ̄ for which the assumption holds is unique. We make the further
assumption that for each θ there is a θ′ such that

(6)    $\left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]^{h(\theta)} f(x, \theta) = f(x, \theta')$.

We now prove (4) for θ < θ̄ by assuming (3) for θ and θ′. Since

h(θ′) = −h(θ),

we have θ′ > θ̄. The sequential probability ratio test T* defined by (1) can also
be defined by

(7)    $B^{h(\theta)} < \prod_{i=1}^{m}\left[\frac{f(x_i, \theta_1)}{f(x_i, \theta_0)}\right]^{h(\theta)} < A^{h(\theta)}$

or by

(8)    $B^{h(\theta)} < \prod_{i=1}^{m}\frac{f(x_i, \theta')}{f(x_i, \theta)} < A^{h(\theta)}$.

Then (4) follows by the usual optimum property because T* is a sequential
probability ratio test for testing the hypothesis θ versus the hypothesis θ′. For
θ > θ̄ a similar argument can be used.

The conditions assumed for this extended property are satisfied by many 
distributions. In particular the existence of such so-called conjugate pairs for 
distributions of the Koopman-Darmois form has been shown [2]. Savage [3] 
has shown that the assumptions restrict the families to have a certain exponen- 
tial form (which includes the Koopman-Darmois form). This note makes ex- 
plicit Blasbalg’s statement [1] that a sequential probability ratio test is opti- 
mum at an infinity of parameter points. 
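
As a concrete illustration of assumptions (5) and (6), consider the normal family with unit variance, which is of Koopman-Darmois form. This example is not in the note: for it one finds h(θ) = (θ₀ + θ₁ − 2θ)/(θ₁ − θ₀), θ̄ = (θ₀ + θ₁)/2 and the conjugate value θ′ = θ₀ + θ₁ − θ. The sketch below, with hypothetical function names, checks relation (6) and h(θ′) = −h(θ) numerically.

```python
from math import exp, pi, sqrt


def normal_pdf(x, theta):
    """Density of a normal variable with mean theta and unit variance."""
    return exp(-0.5 * (x - theta) ** 2) / sqrt(2.0 * pi)


def h(theta, theta0, theta1):
    """Nonzero root of (5) for the unit-variance normal family."""
    return (theta0 + theta1 - 2.0 * theta) / (theta1 - theta0)


def conjugate(theta, theta0, theta1):
    """Value theta' paired with theta in (6)."""
    return theta0 + theta1 - theta


def lhs_of_6(x, theta, theta0, theta1):
    """Left-hand side of (6): likelihood ratio to the power h(theta), times f(x, theta)."""
    ratio = normal_pdf(x, theta1) / normal_pdf(x, theta0)
    return ratio ** h(theta, theta0, theta1) * normal_pdf(x, theta)


if __name__ == "__main__":
    theta0, theta1, theta = 0.0, 1.0, 0.2
    theta_prime = conjugate(theta, theta0, theta1)
    for x in (-1.0, 0.3, 2.5):
        print(lhs_of_6(x, theta, theta0, theta1), normal_pdf(x, theta_prime))
```

Here θ = 0.2 lies below θ̄ = 0.5, so h(θ) is positive and the conjugate value θ′ = 0.8 lies above θ̄, as the argument requires.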
A NOTE ON BALANCED DESIGNS

By V. R. RAO
University of Bombay


0. Summary. It is proved that a necessary and sufficient condition for a 
general design to be balanced is that the matrix of the adjusted normal equations 
for the estimates of treatment effects has v — 1 equal latent roots other than 
zero. 


1. Estimates and their properties. We consider a design whose incidence
matrix is $N_{vb} = [n_{ij}]$, in which the ith treatment is replicated $r_i$ times and the
blocks are of sizes $k_1, \cdots, k_b$. With the usual assumptions, the adjusted normal
equations for the treatment effects are

(1.1)    $Q = C\hat\tau$,

where

(1.2)    $Q = T - N\,\mathrm{diag}\left(\frac{1}{k_1}, \cdots, \frac{1}{k_b}\right)B$

and

(1.3)    $C = \mathrm{diag}(r_1, \cdots, r_v) - N\,\mathrm{diag}\left(\frac{1}{k_1}, \cdots, \frac{1}{k_b}\right)N'$

with the condition

(1.4)    $E_{1v}\hat\tau = 0$

(where $E_{pq}$ denotes a p × q matrix with all its elements as unity).

It is well known that if rank C = v − t, a set of t − 1 independent treatment
contrasts are not estimable. But if rank C = v − 1 every contrast is estimable
and in this case the design is said to be connected.

If the design is connected there are v − 1 non-zero latent roots, say, $\lambda_1,
\lambda_2, \cdots, \lambda_{v-1}$. As the rows of C add to zero, $(v^{-1/2}, \cdots, v^{-1/2})$ is the latent vector
corresponding to the root zero.

Let

(1.5)    $L = \begin{bmatrix} L_1 \\ v^{-1/2}E_{1v} \end{bmatrix}$,  $L_1 = [l_{\nu i}]$,

be an orthogonal matrix transforming C into diagonal form.
Since L is orthogonal,

(1.6)    $I = L'L = L_1'L_1 + \frac{1}{v}E_{vv}$.

Pre-multiplying (1.1) by L, we get

(1.7)    $LQ = LCL'L\hat\tau = \mathrm{diag}(\lambda_1, \cdots, \lambda_{v-1}, 0)\,L\hat\tau$.

Hence we get

$L_1\hat\tau = \mathrm{diag}\left(\frac{1}{\lambda_1}, \cdots, \frac{1}{\lambda_{v-1}}\right)L_1 Q$.

Premultiplying by $L_1'$ and using (1.6) and (1.4), we obtain

(1.8)    $\hat\tau = DQ$,

where

(1.9)    $D = [d_{ij}] = L_1'\,\mathrm{diag}\left(\frac{1}{\lambda_1}, \cdots, \frac{1}{\lambda_{v-1}}\right)L_1$.

From the solution (1.8), it follows that

(1.10)    $V(\hat\tau) = D\sigma^2$,
          $V(\hat\tau_i - \hat\tau_j) = (d_{ii} + d_{jj} - 2d_{ij})\sigma^2 = \sigma^2\sum_{\nu=1}^{v-1}\frac{(l_{\nu i} - l_{\nu j})^2}{\lambda_\nu}$,

(1.11)    Average variance $= \frac{2}{v(v-1)}\sum_{i<j}V(\hat\tau_i - \hat\tau_j) = \frac{2\sigma^2}{v-1}\sum_{\nu=1}^{v-1}\frac{1}{\lambda_\nu}$,

in view of the orthogonality conditions, a result which was obtained by O.
Kempthorne in an alternative way [1].

DEFINITION. A design is said to be balanced if every elementary contrast,
$\tau_i - \tau_j$, is estimated with the same variance.


2. Theorem. A necessary and sufficient condition for a design to be balanced is
that C has v − 1 equal latent roots other than zero.

To prove that the condition is necessary it is enough to show that $\lambda_1 = \cdots =
\lambda_{v-1}$, for the C matrix of a balanced design is of rank v − 1.

From (1.10) and (1.11), we get 


— 


r—1 i a 1)? 9 r— 1 . 
SE wx ig ee wet the = Os 


— 





oat r», o— 112i, 2; k v= 





Hence 
(2.1) 


Consider 
V(#;, — #;)) + V(4i — fe) — 2 Cov (*%; — 4;, 
Hence, 


= 4 


r-l r-l v—1l 
> (bi — by) Mls = lon) ab wil Zz 3 (li — lj )(li — bee). 
vol 


rN 7 = 1 v= he vi =) 


Hence, 


h 1 Fi 
, ~~ l oat Ay: 


v1 
2» (ba; — bvj) (lo; — lon) ( 


From (2.1) and (2.2) taking 7 = 1, we get 


i v—i 


qa” diag (; — : 
(2.3) 


where d” is the column vector 


fla — by, ln — 5, °°° 


then 


and det. M’M = v # 0. Hence M’M and hence M are non-singular. 


) =Q fori #j #k, 


Therefore d*’, --- , d’ are v — 1 linearly independent (v — 1)-vectors. Any 
(v — 1) vector, say £, can be uniquely expressed in terms of these vectors, sa 
i 1 


t= C,d” + Cd” + --- + Cd”. 
From (2.3) it follows that


£ ding (| — -s peng hin: sal Li )r=0 
At u l y 


— A, a v— ] r 
(2.4) 
- Gj") a ] | ] ] ] 1 ( 
ae C;C;d”’ ° dis _— mali ie gieeap ie ane —~ id” =0, 
», " diag (t ier er 42 ;) 
By taking successively — to be the (v — 1) vectors (1, 0,--- , 0), --- and (0, 
Q,---, 1), we get 
Low® be a dl 
Mi Ae Ne 
Hence

$\lambda_1 = \lambda_2 = \cdots = \lambda_{v-1}$.

The condition is sufficient, for, if rank C = v − 1 and $\lambda_1 = \cdots = \lambda_{v-1} = \lambda$
(say), it follows immediately that every elementary contrast is estimable, and
the solutions become

$\hat\tau = \frac{1}{\lambda}L_1'L_1 Q = \frac{1}{\lambda}\left(I - \frac{1}{v}E_{vv}\right)Q = \frac{Q}{\lambda}$,

which shows that $V(\hat\tau_i - \hat\tau_j) = (2/\lambda)\sigma^2$, which is independent of both i and j
and hence, the design is balanced. Q.E.D.

COROLLARIES. (i) If the design is balanced, then

(2.5)    $C = \lambda\left(I - \frac{1}{v}E_{vv}\right)$

and the solutions are

(2.6)    $\hat\tau_i = \frac{Q_i}{\lambda}$.

(ii) In a balanced design with equal block sizes, k, the replicate numbers must be
equal.

PROOF. $C = \mathrm{diag}(r_1, \cdots, r_v) - \frac{1}{k}NN'$ if block size is constant. Hence by
Eq. (2.5), if the design is also balanced, we have, comparing diagonal elements,

$r_i\left(1 - \frac{1}{k}\right) = \lambda\left(1 - \frac{1}{v}\right)$.

Hence, $r_i$ is the same constant for all i. Q.E.D.
(iii) If all the treatments are replicated the same number of times and the blocks
are of the same size then the only balanced design is BIBD, if such a design exists.

PROOF. If r is the number of replications and k is the block size, then

$C = rI - \frac{1}{k}NN' = \lambda\left(I - \frac{1}{v}E_{vv}\right)$, by Corollary (i).

Hence comparing off-diagonal elements, we get

$\lambda_{ij} = \frac{k\lambda}{v}$,

where $\lambda_{ij}$ is the number of times the pair of treatments i, j occur together in the
blocks. Since the $\lambda_{ij}$'s are all equal the design is a Balanced Incomplete Block Design
(BIBD) [2]. This result was proved in an alternative form by W. A. Thompson
[3].

3. Concluding remarks. These results do not exclude the possibility of the
existence of balanced designs with different block sizes and the same number of
replications. As an example consider the design whose incidence matrix is

        1 1 1 0 1 1 1 0 0 0
    N = 1 1 0 1 1 0 0 1 1 0
        1 0 1 1 0 1 0 1 0 1
        0 1 1 1 0 0 1 0 1 1

    r = (6, 6, 6, 6);  k = (3, 3, 3, 3, 2, 2, 2, 2, 2, 2).

Here it can be verified that every elementary contrast is estimated with a
variance equal to 3σ²/7, but the design is not a Balanced Incomplete Block
Design.

It can also be seen that the example given above is obtained by adjoining two
BIBD's with the same number of treatments. Such designs can be constructed
from two BIBD's with the same number of treatments. Investigations on these
lines are being carried out.
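
The variance 3σ²/7 quoted for the example can be verified directly from the C matrix. The sketch below is an illustration, not part of the note: it assumes that the example consists of the four blocks of size 3 and the six blocks of size 2 on four treatments (the two BIBD's being adjoined), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def c_matrix(v, blocks):
    """C = diag(r_i) - N diag(1/k_j) N' for a binary design given as a list of blocks."""
    C = [[Fraction(0)] * v for _ in range(v)]
    for block in blocks:
        k = len(block)
        for i in block:
            C[i][i] += 1                      # each occurrence adds one replication
            for j in block:
                C[i][j] -= Fraction(1, k)     # entry of N diag(1/k) N'
    return C


def common_root(C):
    """Return lambda if C = lambda (I - E/v), i.e. the design is balanced; else None."""
    v = len(C)
    lam = -v * C[0][1]
    for i in range(v):
        for j in range(v):
            expected = lam * (Fraction(int(i == j)) - Fraction(1, v))
            if C[i][j] != expected:
                return None
    return lam


if __name__ == "__main__":
    # assumed structure of the example: all triples and all pairs of 4 treatments
    blocks = list(combinations(range(4), 3)) + list(combinations(range(4), 2))
    lam = common_root(c_matrix(4, blocks))
    print(lam, 2 / lam)   # common latent root and the contrast variance in units of sigma^2
```

Under this assumption the common non-zero latent root is 14/3, so each elementary contrast has variance (2/λ)σ² = 3σ²/7, in agreement with the text.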


Acknowledgement. The author wishes to express his indebtedness to Professor 
M. C. Chakrabarti for suggesting this problem and for his help and guidance in 
preparing this note. 

REFERENCES
[1] O. KEMPTHORNE, "The Efficiency Factor of an Incomplete Block Design," Ann. Math.
Stat. Vol. 27 (1956), pp. 846-849.
[2] F. YATES, "Incomplete Randomized Blocks," Annals of Eugenics, Vol. VII (1937),
pp. 121-140.
[3] W. A. THOMPSON, "A Note on Balanced Incomplete Block Designs," Ann. Math. Stat.
Vol. 27 (1956), pp. 842-846.


THE SPACING OF OBSERVATIONS IN POLYNOMIAL REGRESSION 
By P. G. Guest 
University of Sydney, Australia 


1. Introduction and summary. De la Garza ([1], [2]) has considered the
estimation of a polynomial of degree p from n observations in a given range of the
independent variable x. This range may conveniently be taken to be from +1
to −1. He showed that for any arbitrary distribution of the points of observation
there was a distribution of the n observations at only p + 1 points for which
the variances (determined by the matrix $X^T W X$) were the same. He then
considered how these p + 1 points should be distributed so that the maximum
variance of the fitted value in the range of interpolation should be as small as
possible. In the present note general formulae will be obtained for the distribution
of the points of observation and for the variances of the fitted values in the
minimax variance case, and the variances will be compared with those for the
uniform spacing case.¹


2. Spacing for minimax variance. The fitted value is given by

(1)    $u_p(x) = \sum_{j=0}^{p} L_j(x)\,\bar{y}_j,$

where $L_j(x)$ is the Lagrangian coefficient corresponding to the point of
observation $x_j$ and $\bar{y}_j$ is the mean of the observed values at this point.
The variance of the fitted value is

       $\operatorname{var} u_p(x) = \sum_{j=0}^{p} L_j^2(x)\,\operatorname{var} \bar{y}_j.$

At a point of observation $x_j$,

       $L_j(x_j) = 1, \qquad L_i(x_j) = 0 \quad (i \ne j),$

and

       $\operatorname{var} u_p(x_j) = \operatorname{var} \bar{y}_j.$

The largest value of this variance will be as small as possible when the $n$
observations are equally divided among the $p + 1$ points. When this is done

(2)    $\operatorname{var} u_p(x_j) = (p + 1)\sigma^2/n$

and

(2.1)  $\operatorname{var} u_p(x) = \sum_{j=0}^{p} L_j^2(x)\,(p + 1)\sigma^2/n.$

Since this is a polynomial of degree $2p$, the minimax variance conditions are
obtained when the maxima of $\operatorname{var} u_p(x)$ are at the $p - 1$ internal
points $x_j$, and the end points $x_0$ and $x_p$ are +1 and -1; for then
$\operatorname{var} u_p(x)$ never exceeds $(p + 1)\sigma^2/n$ in the range +1 to -1.
The minimax variance conditions are thus

(3)    $L_j'(x_j) = 0, \qquad j = 1 \text{ to } p - 1.$


¹ K. Smith, in an earlier discussion, has given details of curves up to the sixth degree
(Biometrika 12 (1918), pp. 1-85).

Now, if

(4)    $F(x) = \prod_{j=0}^{p} (x - x_j),$

then

       $L_j(x) = \frac{F(x)}{(x - x_j)\,F'(x_j)},$

and so

       $F''(x) = \{(x - x_j)L_j''(x) + 2L_j'(x)\}\,F'(x_j),$

and (3) is equivalent to

(5)    $F''(x_j) = 0, \qquad j = 1 \text{ to } p - 1.$

The function $F(x)$ will be of the form $a(x^2 - 1)\phi_{p-1}(x)$, where the polynomial
$\phi_{p-1}(x)$ of degree $p - 1$ is determined by the $p - 1$ equations (5). The
polynomial which satisfies these equations is readily shown to be the derivative
$P_p'(x)$ of the Legendre polynomial. For if

(6)    $F(x) = a(x^2 - 1)P_p'(x),$

then

       $F'(x) = a\,\frac{d}{dx}\{(x^2 - 1)P_p'(x)\} = a\,p(p + 1)P_p(x)$

and

       $F''(x) = a\,p(p + 1)P_p'(x),$

and so $F''(x)$ vanishes at the internal points $F(x) = 0$.


The points of observation for minimax variance are then to be located at
+1, -1, and the roots of $P_p'(x) = 0$.

Since the internal points of observation are points of maximum variance,
the variance will be given by an equation of the form

(7)    $\operatorname{var} u_p(x) = \{1 + B(x^2 - 1)[P_p'(x)]^2\}(p + 1)\sigma^2/n.$

The minima of the variance curve then occur at points for which

       $x P_p'(x) + (x^2 - 1)P_p''(x) = 0,$

and this equation is equivalent to

(8)    $x P_p'(x) = p(p + 1)P_p(x).$

From (2.1),

(9)    $\operatorname{var} u_p(x) = \sum_{j=0}^{p} \left\{ \frac{(x^2 - 1)P_p'(x)}{(x - x_j)\,p(p + 1)P_p(x_j)} \right\}^2 (p + 1)\sigma^2/n,$

and so, on comparing the coefficients of $x^{2p}$ in (7) and (9),

       $B = \sum_{j=0}^{p} \{p(p + 1)P_p(x_j)\}^{-2}.$

The Lobatto quadrature formula [3] with $f(x) = 1$ gives

       $\int_{-1}^{1} dx = \sum_{j=0}^{p} 2\{p(p + 1)P_p^2(x_j)\}^{-1} = 2.$

Thus the explicit formula for the variance of the fitted value is

(10)   $\operatorname{var} u_p(x) = \left\{1 + \frac{x^2 - 1}{p(p + 1)}[P_p'(x)]^2\right\}(p + 1)\sigma^2/n.$

In the region of extrapolation, when $|x|$ is large,

       $P_p'(x) \approx p\{(2p)!/2^p p!^2\}x^{p-1}$

and so

(11)   $\operatorname{var} u_p(x) \approx p\{(2p)!/2^p p!^2\}^2 x^{2p}\sigma^2/n.$
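As a numerical sanity check, the closed form for the variance, $\{1 + (x^2 - 1)[P_p'(x)]^2/p(p+1)\}(p+1)\sigma^2/n$, can be compared with the direct Lagrangian sum $\sum_j L_j^2(x)(p+1)\sigma^2/n$. The sketch below does this for p = 2 and p = 3 only, with the design points hard-coded (0 is the root of $P_2'$, and $\pm 1/\sqrt{5}$ are the roots of $P_3'$). The function names are hypothetical and variances are expressed in units of $\sigma^2/n$.

```python
import math


def legendre(p, x):
    """Return (P_p(x), P_p'(x)) using the standard recurrences."""
    if p == 0:
        return 1.0, 0.0
    prev, cur = 1.0, x          # P_0, P_1
    dprev, dcur = 0.0, 1.0      # P_0', P_1'
    for k in range(1, p):
        nxt = ((2 * k + 1) * x * cur - k * prev) / (k + 1)
        dnxt = dprev + (2 * k + 1) * cur   # P_{k+1}' = P_{k-1}' + (2k+1) P_k
        prev, cur = cur, nxt
        dprev, dcur = dcur, dnxt
    return cur, dcur


# Minimax design points: +1, -1 and the roots of P_p'(x) = 0.
POINTS = {
    2: [-1.0, 0.0, 1.0],
    3: [-1.0, -1.0 / math.sqrt(5.0), 1.0 / math.sqrt(5.0), 1.0],
}


def variance_lagrange(p, x):
    """(p + 1) * sum_j L_j(x)^2, the direct Lagrangian form."""
    pts = POINTS[p]
    total = 0.0
    for j, xj in enumerate(pts):
        lj = 1.0
        for i, xi in enumerate(pts):
            if i != j:
                lj *= (x - xi) / (xj - xi)
        total += lj * lj
    return (p + 1) * total


def variance_closed_form(p, x):
    """Closed form {1 + (x^2 - 1) P_p'(x)^2 / (p (p + 1))} (p + 1)."""
    _, dp = legendre(p, x)
    return (1.0 + (x * x - 1.0) * dp * dp / (p * (p + 1))) * (p + 1)


for p in (2, 3):
    worst = max(abs(variance_lagrange(p, x / 20) - variance_closed_form(p, x / 20))
                for x in range(-30, 31))
    print(p, worst)
```

The grid runs from -1.5 to 1.5, so the comparison covers part of the region of extrapolation as well as the region of interpolation.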


3. Uniform spacing. When the observations are spaced at equal intervals the
variance of the fitted value is

       $\operatorname{var} u_p(x) = \sum_{j=0}^{p} \left\{ T_j^2(x) \Big/ \sum_{i=1}^{n} T_j^2(x_i) \right\} \sigma^2,$

where the $T_j(x)$ are the polynomials orthogonal over the $n$ points of
observation $x_i$. When $n$ is large these polynomials will approximate to multiples
of the Legendre polynomials $P_j(x)$ which are orthogonal over the continuous range
+1 to -1. Thus

       $T_j(x) \sim k_j P_j(x)$

and

       $\sum_{i} T_j^2(x_i)\,\Delta x_i \sim k_j^2 \int_{-1}^{1} P_j^2(x)\,dx = 2k_j^2/(2j + 1).$

The interval $\Delta x_i$ between neighboring observations is $2/n$, and so

       $\sum_{i} T_j^2(x_i) \sim n k_j^2/(2j + 1)$

and

(12)   $\operatorname{var} u_p(x) \sim \sum_{j=0}^{p} (2j + 1)P_j^2(x)\,\sigma^2/n.$

The maxima and minima of variance are at points given by

       $\sum_{j=0}^{p} (2j + 1)P_j(x)P_j'(x) = 0,$

which from the recurrence relations for Legendre polynomials is

       $\sum_{j=1}^{p} P_j'(x)\{P_{j+1}'(x) - P_{j-1}'(x)\} = P_p'(x)P_{p+1}'(x) = 0.$

The points of maximum variance are then the roots of $P_p'(x) = 0$ and the points
of minimum variance the roots of $P_{p+1}'(x) = 0$. It is interesting to observe that
the points of observation in the minimax variance method are points of maximum
variance in the uniform spacing method. These points are also the points
used in the Lobatto quadrature formula.

The Christoffel-Darboux identity [3] for the sum in equation (12) leads to
the alternative form

(12.1) $\operatorname{var} u_p(x) \sim (p + 1)[P_p(x)P_{p+1}'(x) - P_{p+1}(x)P_p'(x)]\,\sigma^2/n.$

By use of the recurrence relations for the Legendre polynomials this can be put
in the form

(12.2) $\operatorname{var} u_p(x) \sim \left\{(p + 1)P_p^2(x) - \frac{x^2 - 1}{p + 1}[P_p'(x)]^2\right\}(p + 1)\sigma^2/n.$

At the end-points +1 and -1, $P_p^2(x)$ is unity and

       $\operatorname{var} u_p(\pm 1) \sim (p + 1)^2\sigma^2/n.$

At the centre of the range the variance can be obtained by substituting the
values of $P_p(0)$ and $P_p'(0)$ in (12.2). It is found that

(13)   $\operatorname{var} u_p(0) \sim \left\{\frac{(2g + 1)(2g - 1) \cdots 3 \cdot 1}{2^g\,g!}\right\}^2 \sigma^2/n,$

where $g$ is $\tfrac{1}{2}p$ when $p$ is even and $\tfrac{1}{2}(p - 1)$ when $p$ is odd. In the
region of extrapolation, when $|x|$ is large, (12.2) gives

       $\operatorname{var} u_p(x) \approx (2p + 1)\{(2p)!/2^p p!^2\}^2 x^{2p}\sigma^2/n.$

Fig. 1. The solid curve shows the variance of the fitted value for the minimax variance
method and the dotted curve the variance for the uniform spacing method. The unit for
the variance scale is $\sigma^2/n$.


The deviations from these formulae when n is not large have been discussed 
and tabulated [4]. 


4. Comparison of the two methods. In the central part of the range the
uniform spacing method gives a smaller variance than the minimax variance
method. An asymptotic expansion of (13) using Stirling's factorial
approximation shows that the ratio of the variances is roughly $2/\pi$. This ratio
increases steadily with $|x|$, and at the ends of the range the variance for the
uniform spacing method exceeds that for the minimax variance method by a factor
$p + 1$, while in the region of extrapolation this factor approaches $2 + p^{-1}$. The
crossover points for the two variance curves occur at $\pm 0.58$ for the quadratic
and $\pm 0.72$ for the cubic. Thus over most of the region of interpolation the
advantage lies with the uniform spacing method, but at the extremes of the
region of interpolation and in the region of extrapolation the advantage lies
decidedly with the minimax variance method.
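The crossover points can be located numerically from the two variance formulae, $\{1 + (x^2 - 1)[P_p'(x)]^2/p(p+1)\}(p+1)$ for the minimax method and $\sum_{j=0}^{p}(2j+1)P_j^2(x)$ for the large-n uniform method, both in units of $\sigma^2/n$. The following is a minimal sketch; the function names are hypothetical, and the bisection assumes a single sign change of the difference on [0, 1], which holds for the quadratic and cubic cases tried here.

```python
def legendre_table(p, x):
    """Return the lists [P_0(x), ..., P_p(x)] and [P_0'(x), ..., P_p'(x)]."""
    vals, ders = [1.0], [0.0]
    if p >= 1:
        vals.append(x)
        ders.append(1.0)
    for k in range(1, p):
        vals.append(((2 * k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
        ders.append(ders[k - 1] + (2 * k + 1) * vals[k])
    return vals, ders


def var_minimax(p, x):
    """Minimax-spacing variance of the fitted value, in units of sigma^2/n."""
    _, ders = legendre_table(p, x)
    return (1.0 + (x * x - 1.0) * ders[p] ** 2 / (p * (p + 1))) * (p + 1)


def var_uniform(p, x):
    """Large-n uniform-spacing variance, in units of sigma^2/n."""
    vals, _ = legendre_table(p, x)
    return sum((2 * j + 1) * vals[j] ** 2 for j in range(p + 1))


def crossover(p, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the point in [lo, hi] where the two curves cross."""
    def diff(x):
        return var_uniform(p, x) - var_minimax(p, x)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


for p in (2, 3):
    print(p, round(crossover(p), 4), var_uniform(p, 1.0) / var_minimax(p, 1.0))
```

For the quadratic the difference of the two curves reduces to $6.75x^4 - 0.75$, so the crossover is at $x = 1/\sqrt{3}$, in agreement with the quoted 0.58.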

Fig. 1 shows the shape of the two variance curves in the region of
interpolation for the second and third degree polynomials. Since the curves are
symmetrical about the origin of $x$, only half of each curve is drawn.

CONDITIONS THAT A STOCHASTIC PROCESS BE ERGODIC
By EMANUEL PARZEN
Stanford University


In his work on statistical inference on stochastic processes, Grenander has
pointed out ([2], p. 257) that "the concept of metric transitivity seems to be
important in the problem of estimation of a stationary stochastic process." In
this note, we give necessary and sufficient conditions in terms of characteristic
functions that a strictly stationary stochastic process $X(t)$ be metrically
transitive or ergodic (see Doob [1], pp. 452-457 for definition of the terminology).
More importantly, we state a mean ergodic theorem (or weak law of large
numbers) for stochastic processes which are strictly stationary of order $K$, by
which is meant that for every choice of $K$ points $t_1, \cdots, t_K$, the random
variables $X(t_1 + h), \cdots, X(t_K + h)$ have a joint probability distribution which
does not depend on $h$.

THEOREM 1. Let the random variables $X(t)$ be defined for $t$ in
$T = \{0, \pm 1, \pm 2, \cdots\}$. Let $K$ be a positive integer. Let $t_1, \cdots, t_K$ be
points in $T$. Assume that there is a characteristic function
$\varphi(u_1, \cdots, u_K)$ such that, for all $u_1, \cdots, u_K$,

(1.1)  $E[\exp i\{u_1 X(t_1 + h) + \cdots + u_K X(t_K + h)\}] = \varphi(u_1, \cdots, u_K)$
       for all $h$ in $T$.

Assume that, for each $\tau$ in $T$, there is a characteristic function
$\varphi(u_1, \cdots, u_K; \tau)$ such that

(1.2)  $E[\exp i\{u_1(X(t_1 + h) - X(t_1 + h + \tau)) + \cdots + u_K(X(t_K + h) - X(t_K + h + \tau))\}]
       = \varphi(u_1, \cdots, u_K; \tau)$ for all $h$ in $T$.

Let $r \ge 1$. Then for every Borel function $g(x_1, \cdots, x_K)$ such that

       $E\,|g(X(t_1), \cdots, X(t_K))|^r < \infty,$

the sample means

(1.3)  $M_n(g) = \frac{1}{n + 1} \sum_{h=0}^{n} g(X(t_1 + h), \cdots, X(t_K + h))$

converge as a limit in $r$-mean. A necessary and sufficient condition that the
limit of the $M_n(g)$ be the ensemble mean $E(g) = E g(X(t_1), \cdots, X(t_K))$ is that,
for all real $u_1, \cdots, u_K$,

(1.4)  $\lim_{n \to \infty} \frac{1}{n + 1} \sum_{\tau=0}^{n} \varphi(u_1, \cdots, u_K; \tau) = |\varphi(u_1, \cdots, u_K)|^2.$

The meaning of these conditions is as follows: (1.1) states, in terms of
characteristic functions, that the stochastic process is strictly stationary of
order $K$; (1.2) states that the process of increments $Y(t) = X(t) - X(t + \tau)$ is
strictly stationary of order $K$; (1.4) represents a very weak form of asymptotic
independence.
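Condition (1.4) can be made concrete in the simplest case K = 1 with a zero-mean stationary normal process, an assumption made here purely for illustration. If $R(\tau)$ is the covariance function, then $X(h) - X(h + \tau)$ is normal with variance $2(R(0) - R(\tau))$, so $\varphi(u; \tau) = \exp\{-u^2(R(0) - R(\tau))\}$ and $|\varphi(u)|^2 = \exp\{-u^2 R(0)\}$. The sketch below compares the Cesàro mean in (1.4) with $|\varphi(u)|^2$ for two hypothetical covariances: $R(\tau) = 0.8^{|\tau|}$, and $R(\tau) = 1$ (the process $X(t) = Z$ for all $t$).

```python
import math


def increment_cf(u, covariance):
    """Return tau -> phi(u; tau) for a zero-mean stationary normal process."""
    # X(h) - X(h + tau) is N(0, 2 (R(0) - R(tau))).
    return lambda tau: math.exp(-u * u * (covariance(0) - covariance(tau)))


def squared_modulus(u, covariance):
    """|phi(u)|^2, where X(t) is N(0, R(0))."""
    return math.exp(-u * u * covariance(0))


def cesaro_mean(phi_tau, n):
    """Left-hand side of (1.4) before taking the limit."""
    return sum(phi_tau(tau) for tau in range(n + 1)) / (n + 1)


def geometric(tau):
    return 0.8 ** abs(tau)      # covariance decaying to zero


def constant(tau):
    return 1.0                  # X(t) = Z for every t


u, n = 1.0, 20000
for name, cov in (("geometric", geometric), ("constant", constant)):
    lhs = cesaro_mean(increment_cf(u, cov), n)
    rhs = squared_modulus(u, cov)
    print(name, lhs, rhs)
```

In the first case the two sides agree to within O(1/n); in the second the left side is identically 1 while the right side is $e^{-u^2}$, so (1.4) fails, as it should for a process whose sample means do not converge to the ensemble mean.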

From Theorem 1, together with the Birkhoff-Khintchine ergodic theorem 
(see Doob [1], pp. 464-473) we immediately obtain the following theorem. 

THEOREM 2: A strictly stationary stochastic process $X(t)$ is metrically
transitive if, and only if, for every positive integer $K$, for any choice of $K$
points $t_1, \cdots, t_K$, and for any real numbers $u_1, \cdots, u_K$, (1.4) holds.

The conditions of Theorem 2 constitute a formulation in terms of character- 
istic functions of known conditions for metric transitivity (see Loéve [4], p. 435). 

As an indication of the power of these theorems, let us mention that with
their aid one can readily establish the following statement made without proof
in the book of Grenander and Rosenblatt ([3], p. 44): If $X(t)$ is a normal process,
a necessary and sufficient condition for it to be ergodic (metrically transitive)
is that its spectrum be continuous. If $X(t)$ is a linear process, then it is ergodic.

Theorem 1, and consequently Theorem 2, may be extended to the case of
continuous parameter stochastic processes. They provide a new proof of the
theorem of Maruyama (see [2], p. 257) that a continuous stationary normal
process is metrically transitive if, and only if, its spectrum is continuous.

Theorem 1 is very closely related to the weak law of large numbers for
wide-sense stationary processes (see Doob [1], p. 489), from which it differs in that
it does not require existence of second moments for $X(t)$.

The proof of Theorem 1 is fairly immediate. From (1.1), (1.2), and (1.4), it
follows (either by the weak law of large numbers for wide-sense stationary
processes, or directly by a simple argument [6]) that the theorem holds for
trigonometric polynomials $g(x_1, \cdots, x_K) = \exp i(u_1 x_1 + \cdots + u_K x_K)$. To
extend the theorem to Borel functions $g(x_1, \cdots, x_K)$ such that $E|g|^r < \infty$,
one uses the fact that to any $\epsilon > 0$ one may find a trigonometric polynomial
$g_\epsilon(x_1, \cdots, x_K)$ such that

       $E\,|g(X(t_1), \cdots, X(t_K)) - g_\epsilon(X(t_1), \cdots, X(t_K))|^r < \epsilon.$

In [5] one may find related theorems, including a discussion of convergence
with probability one of certain sample means $M_n(g)$ of stochastic processes
which are strictly stationary of order $K$.

POWER FUNCTIONS OF THE GAMMA DISTRIBUTION

By GERALD D. BERNDT
Advisory Committee on Weather Control

Power functions are given for testing hypotheses on an increase in the mean
$\mu$ of a gamma variable.


Let $x$ be a random variable from a gamma population and let the frequency
distribution of $x$ be given by

       $f_0 = f(x; \beta, \gamma) = (\beta^\gamma \Gamma(\gamma))^{-1} x^{\gamma - 1} \exp(-x/\beta), \qquad x > 0,$
       $\phantom{f_0} = 0, \qquad x \le 0,$

where $\beta > 0$ and $\gamma > 0$. If $x$ then undergoes a scale change of the form
$x \to \delta x$ with $\delta > 1$, it is easily verified that the frequency distribution
of $\delta x$ is given by

       $f_1 = f(\delta x; \delta\beta, \gamma) = ((\delta\beta)^\gamma \Gamma(\gamma))^{-1} x^{\gamma - 1} \exp(-x/\delta\beta), \qquad x > 0,$
       $\phantom{f_1} = 0, \qquad x \le 0.$

Now in testing the null hypothesis $H_0\colon \mu = \beta\gamma$ against the alternative
hypothesis $H_1\colon \mu = \delta\beta\gamma$, $\delta > 1$ and specified, the probability of
detecting the hypothesized change in the mean, or the power of the test, is given by

       $\pi = \int_{x(\alpha)}^{\infty} f_1\,dx,$

where $x(\alpha)$ is such that

       $\alpha = \int_{x(\alpha)}^{\infty} f_0\,dx,$

and $\alpha$ is the significance level of the test.

Curves of power functions of testing $H_0$ against $H_1$ are given for

       $\gamma = \tfrac{1}{2}, 1(1)5, 7, 10(5)50,$
       $1.0 \le \delta \le 4.0,$

and

       $\alpha = 0.01, 0.05,$ and $0.10$.

For sufficiently large $\gamma$, the distribution of $x$ converges to the normal
distribution, and for many purposes the power of the test may then be evaluated
by simply using the tables of the normal distribution function with standardized
variates

       $t_\alpha = (x(\alpha) - \beta\gamma)/\beta\sqrt{\gamma},$

which is exceeded with probability $\alpha$ under $H_0$, and

       $t_\pi = (t_\alpha \beta\sqrt{\gamma} + \beta\gamma(1 - \delta))/\delta\beta\sqrt{\gamma},$

which is exceeded with probability $\pi$ under $H_1$. The upper bound of the error
in using the normal approximation to the gamma distribution for $\gamma \ge 50$ is
calculated, by trial, to be

       $\sup |G(x) - N(x)| < 0.019,$

where $G(x)$ is the distribution function of $x$ as a gamma variable and $N(x)$ is
the distribution function of $x$ as a normal variable.

Consider the following example of the use of the accompanying power curves.
In illustrating the use of the power curves we first take note of a well known
property of the gamma distribution. That is, if $x_i$ $(i = 1, \cdots, n)$ are
independent random variables from gamma populations with parameters $\beta$ and
$\gamma_i$, then the sample mean is also a gamma variable with parameters $\beta/n$ and

       $\gamma = \sum_{i=1}^{n} \gamma_i.$

See, for example, [1].

Suppose, for illustration, that a sample of size $n = 10$ is drawn, and that the
$x_i$ $(i = 1, \cdots, 10)$ are known to be independently and identically distributed
gamma variables with $\gamma_i = 2.0$ and the same $\beta$ for each $i$. It is desired to test
$H_0\colon \mu = \mu_0$ against $H_1\colon \mu = 1.5\mu_0 = \mu_1$ with probability $\alpha = 0.05$ of
accepting $H_1$ when in fact $H_0$ is true. What is the probability of detecting
$\mu_1$? Here $\delta = 1.5$ and $\gamma = 20.0$. In Fig. 2 we find $\delta = 1.5$ on the abscissa
and move vertically to the point of intersection with the curve $\gamma = 20.0$. The
power, $\pi_{1.5} = 0.598$, is then the ordinate value at this point of intersection.

How large a sample should be drawn in order that there is at least a
probability of $\pi_{1.5} = 0.75$ of detecting the specified increase $\delta = 1.5$ in
$\mu_0$? Interpolating for the value of $\gamma$ at $\delta = 1.5$ and $\pi_{1.5} = 0.75$, we find
$\gamma = 32$. Hence the sample size should be at least $n = 32/\gamma_i = 16$ in this case.
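The worked example can be reproduced without the curves when $\gamma$ is an integer, because the upper tail of a gamma variable with integer shape $k$ and unit scale is $e^{-x}\sum_{j<k} x^j/j!$. The sketch below rests on that assumption (it does not cover $\gamma = \tfrac{1}{2}$), finds $x(\alpha)$ by bisection, and uses the fact that $\beta$ cancels from the power. The function names are hypothetical.

```python
import math


def gamma_sf(shape, x):
    """P(G > x) for a gamma variable with integer shape and unit scale."""
    term, total = 1.0, 1.0
    for j in range(1, shape):
        term *= x / j
        total += term
    return math.exp(-x) * total


def critical_value(shape, alpha):
    """x(alpha): the point exceeded with probability alpha under H0 (beta = 1)."""
    lo, hi = 0.0, 10.0 * shape + 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if gamma_sf(shape, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def power(shape, delta, alpha):
    """Probability of exceeding x(alpha) when the scale is multiplied by delta."""
    return gamma_sf(shape, critical_value(shape, alpha) / delta)


print(power(20, 1.5, 0.05))   # compare with the value 0.598 read from the curves
print(power(32, 1.5, 0.05))   # compare with the target power 0.75
```

Taking $\beta = 1$ under $H_0$ loses no generality here, since both the critical value and the alternative distribution scale with $\beta$.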

The calculations on which the power curves are based were made using 3- 
point Lagrangian interpolation in Pearson’s tables of the incomplete gamma 
function [2]. All calculations have been verified by actual integration of the 
gamma functions using high-speed computing machinery. This verification was 
carried out under the supervision of Dr. Max A. Woodbury at New York Uni- 
versity. 

I wish to acknowledge the work of Dr. Woodbury and his staff in making
these calculations, and also to thank Elaine Berndt who performed all of the
interpolations we have required and who drafted the accompanying figures.

New York, 1952.
[2] Tables of the Incomplete Gamma Function, Karl Pearson, editor, Cambridge University
    Press, re-issued in 1951.

THE SMALL SAMPLE DISTRIBUTION OF $n\omega_n^2$

By A. W. MARSHALL
The RAND Corporation

The asymptotic distribution of the statistic

(1)    $n\omega_n^2 = n \int_{-\infty}^{\infty} [S_n(x) - F(x)]^2\,dF(x),$

where $S_n(x)$ is the sample cumulative distribution function (CDF), and $F(x)$
the true CDF, is known and tabled [1]. Below are tabled some values of the
CDF's of $n\omega_n^2$ for $n = 1$, 2, and 3. Convergence to the asymptotic distribution
appears to be extremely rapid.

1. General considerations. It is well-known that: (A) the distribution of
$n\omega_n^2$ is distribution free so that it is sufficient to treat the case where $F(x)$
is uniform on the interval [0, 1]; and (B) an equivalent form, especially suitable
for computation from the ordered sample $x_1 \le x_2 \le \cdots \le x_n$, is

(2)    $n\omega_n^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[\frac{2i - 1}{2n} - F(x_i)\right]^2$

or for the case where $F(x)$ is uniform [0, 1]

(3)    $n\omega_n^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[\frac{2i - 1}{2n} - x_i\right]^2.$

As was suggested to me several years ago by Oliver Gross, (3) clearly shows
that the CDF of the $n\omega_n^2$ statistic can be evaluated rather easily for small $n$.

The case $n = 1$ is trivial. For $n = 2$ one must evaluate the area in the
intersection of a circle with its center at $x_1 = \tfrac{1}{4}$, $x_2 = \tfrac{3}{4}$ and a triangle with
vertices at (0, 0), (0, 1), and (1, 1). For $n = 3$ one must evaluate the volume in
the intersection of a sphere with center at the point $(\tfrac{1}{6}, \tfrac{1}{2}, \tfrac{5}{6})$ and the
tetrahedron with vertices at (0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1). From (3) one
also derives the result that $n\omega_n^2$ has a minimum value of $1/12n$ and a maximum
value of $n/3$.

2. Case A: n = 1. Since

       $\omega_1^2 = \tfrac{1}{12} + (\tfrac{1}{2} - x_1)^2,$

the CDF of $\omega_1^2$ is

       $F_1(z) = \Pr[\omega_1^2 \le z] = 0, \qquad z < \tfrac{1}{12},$
       $\phantom{F_1(z) = \Pr[\omega_1^2 \le z]} = (4z - \tfrac{1}{3})^{1/2}, \qquad \tfrac{1}{12} \le z < \tfrac{1}{3},$
       $\phantom{F_1(z) = \Pr[\omega_1^2 \le z]} = 1, \qquad z \ge \tfrac{1}{3}.$

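3. Case B: n = 2.

       $2\omega_2^2 = \tfrac{1}{24} + (\tfrac{1}{4} - x_1)^2 + (\tfrac{3}{4} - x_2)^2.$

Received June 25, 1957.

By evaluating the area common to a circle of radius $(z - 1/24)^{1/2}$ with center
at $(\tfrac{1}{4}, \tfrac{3}{4})$ and the triangle with vertices at (0, 0), (0, 1), and (1, 1), multiplying
by two, the CDF $F_2(z)$ of the associated value of $2\omega_2^2$ is obtained. The result is:

       $F_2(z) = 0, \qquad z < \tfrac{1}{24},$

       $F_2(z) = 2\pi(z - \tfrac{1}{24}), \qquad \tfrac{1}{24} \le z < \tfrac{5}{48},$

       $F_2(z) = 2(z - \tfrac{1}{24})\left[\pi - 2\cos^{-1}\frac{(1/16)^{1/2}}{(z - 1/24)^{1/2}}\right] + (z - \tfrac{5}{48})^{1/2}, \qquad \tfrac{5}{48} \le z < \tfrac{1}{6},$

       $F_2(z) = 2(z - \tfrac{1}{24})\left[\frac{3\pi}{4} - \cos^{-1}\frac{(1/16)^{1/2}}{(z - 1/24)^{1/2}} - \cos^{-1}\frac{(1/8)^{1/2}}{(z - 1/24)^{1/2}}\right]
               + \tfrac{1}{8} + 2^{-1/2}(z - \tfrac{1}{6})^{1/2} + \tfrac{1}{2}(z - \tfrac{5}{48})^{1/2}, \qquad \tfrac{1}{6} \le z < \tfrac{2}{3},$

       $F_2(z) = 1, \qquad z \ge \tfrac{2}{3}.$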
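The small-sample distributions for n = 1 and n = 2 are easy to evaluate by machine. The sketch below computes the statistic from the ordered-sample form $1/12n + \sum_i [(2i-1)/2n - x_i]^2$ for a sample assumed uniform on [0, 1], and evaluates the two CDF's geometrically: for n = 2 as twice the area common to the circle of squared radius $z - 1/24$ centred at $(\tfrac14, \tfrac34)$ and the triangle $0 \le x_1 \le x_2 \le 1$. The piecewise expressions in `cdf_n2` were derived for this sketch from that geometry, and `grid_cdf` is a crude lattice count included only as an independent check; all function names are hypothetical.

```python
import math


def statistic(sample):
    """n * omega_n^2 for a sample assumed uniform on [0, 1]."""
    n = len(sample)
    xs = sorted(sample)
    return 1 / (12 * n) + sum(((2 * i - 1) / (2 * n) - x) ** 2
                              for i, x in enumerate(xs, start=1))


def cdf_n1(z):
    """CDF of omega_1^2."""
    if z < 1 / 12:
        return 0.0
    if z >= 1 / 3:
        return 1.0
    return math.sqrt(4 * z - 1 / 3)


def cdf_n2(z):
    """CDF of 2 * omega_2^2: twice the circle-triangle overlap area."""
    t = z - 1 / 24                      # squared radius of the circle
    if t <= 0:
        return 0.0
    if z >= 2 / 3:
        return 1.0
    if z <= 5 / 48:                     # circle lies inside the triangle
        return 2 * math.pi * t
    r = math.sqrt(t)
    a = math.acos(min(1.0, 1 / (4 * r)))          # sides at distance 1/4
    s = math.sqrt(z - 5 / 48)
    if z <= 1 / 6:                      # two segments cut off by the short sides
        return 2 * math.pi * t - 4 * t * a + s
    b = math.acos(min(1.0, 1 / (2 * math.sqrt(2) * r)))   # hypotenuse
    return (1 / 8 + s / 2 + math.sqrt((z - 1 / 6) / 2)
            + 2 * t * (0.75 * math.pi - a - b))


def grid_cdf(z, m=300):
    """Fraction of an m-by-m midpoint lattice on the unit square with statistic <= z."""
    hits = 0
    for i in range(m):
        for j in range(m):
            if statistic([(i + 0.5) / m, (j + 0.5) / m]) <= z:
                hits += 1
    return hits / (m * m)


for z in (0.08, 0.15, 0.30, 0.50):
    print(z, cdf_n2(z), grid_cdf(z))
```

The lattice count treats both coordinates as unordered uniform variables and sorts them inside `statistic`, which is why no factor of two appears there.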


4. Case C: n = 3. This is the first complicated case and reduces to the
problem of evaluating the volume of the intersection of a sphere of radius
$(z - 1/36)^{1/2}$, with its center at $(\tfrac{1}{6}, \tfrac{1}{2}, \tfrac{5}{6})$, and a tetrahedron with
vertices at (0, 0, 0), (0, 0, 1), (0, 1, 1), and (1, 1, 1). Whereas in the case $n = 2$
there are five intervals over which $F_2(z)$ is separately defined, when $n = 3$ there
are eight: $(-\infty, 1/36)$, $(1/36, 1/18)$, $(1/18, 1/12)$, $(1/12, 1/9)$, $(1/9, 5/24)$,
$(5/24, 11/36)$, $(11/36, 1)$ and $(1, \infty)$. Partial results for these intervals are as
follows, where $t = z - 1/36$:
$F_3^{(1)}(z) = 0$; $F_3^{(2)}(z) = 8\pi t^{3/2}$; $F_3^{(3)}(z) = \tfrac{2}{3}\pi(3z - 1/9)$;
$F_3^{(4)}(z) = \tfrac{2}{3}\pi(3z - 1/9) - 2\pi[4t^{3/2} - 2^{1/2}t + 2^{-1/2}/27] + 6t^{3/2}V(1, t^{-1/2}/6)$;
$\cdots$; $F_3^{(8)}(z) = 1$, where $V(1, a)$ is the volume of the wedge-shaped segment of
the sphere of unit radius, center at the origin, cut out by the two planes $x = a$
and $y = a$. It is possible to obtain expressions in closed form for $F_3(z)$ over all
of the eight intervals; however their derivation is tedious and the expressions
complicated. A numerical evaluation was therefore undertaken by the RAND
Numerical Analysis section. The results of these computations are shown in
Table I along with the calculated values of $F_n(z)$ for $n = 1$ and 2, and for the
asymptotic distribution. The values of $F_3(z)$ appear to be off by one in the fifth
decimal place. The rapid convergence to the asymptotic distribution, especially
in the more interesting region of the tail of distribution, seems clear.

TABLE I
Values of the CDF's of $n\omega_n^2$ for n = 1, 2, 3 and the asymptotic distribution at
selected points

    F_2(z)     F_3(z)     F(z)

    .46692     .47343     .50000
    .57614     .57683     .60000
    .63384     .63009     .65000
    .68842     .68521     .70000
    .73974     .74191     .75000
    .79126     .79924     .80000
    .84515     .85481     .85000
    .90296     .90617     .90000
    .94007     .93661     .93000
    .96554     .95723     .95000
    .98968     .97793     .97000
   1.00000     .99680     .99000
   1.00000    1.00000     .99900

One other piece of evidence, although of a much weaker sort, is available
that suggests that the asymptotic distribution is a good approximation to the
exact distribution for small $n$. A sample of 400 values of $n\omega_n^2$ was produced for
the case $n = 10$. Grouping into twenty cells using the 5th, 10th, 15th, $\cdots$,
percentage points of the asymptotic distribution gave the following cell entries:
13, 19, 20, 18, 11, 21, 16, 18, 28, 17, 21, 22, 26, 18, 21, 16, 23, 25, 23, 24.
Application of the $\chi^2$ test gives a value of $\chi^2 = 17.5$. With 19 d.f. this value is
exceeded with probability of approximately .55. Application of the Kolmogorov
test statistic, $\sup |S_n(x) - F(x)|$, to the grouped data (for an approximate
test) gives a value of 1.20. This value would be exceeded on the order of 11 per
cent of the time under the null hypothesis.
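Both test values can be recomputed from the twenty cell entries. The sketch below assumes equal expected counts of 20 per cell and takes the Kolmogorov value to be $\sqrt{400}$ times the largest gap between the grouped empirical CDF and the cell boundaries 0.05, 0.10, $\cdots$; that scaling is an interpretation adopted here because it matches the quoted value.

```python
import math

cells = [13, 19, 20, 18, 11, 21, 16, 18, 28, 17,
         21, 22, 26, 18, 21, 16, 23, 25, 23, 24]

total = sum(cells)
expected = total / len(cells)           # 20 per cell under the null hypothesis

# Pearson chi-square over the twenty equiprobable cells.
chi_square = sum((count - expected) ** 2 / expected for count in cells)

# Largest gap between the grouped empirical CDF and the asymptotic CDF,
# evaluated at the upper boundary of each cell.
running, max_gap = 0, 0.0
for index, count in enumerate(cells, start=1):
    running += count
    max_gap = max(max_gap, abs(running / total - index / len(cells)))
kolmogorov = math.sqrt(total) * max_gap

print(total, chi_square, kolmogorov)
```

The largest gap is 0.06, reached after the eighth cell, where 136 of the 400 values have accumulated against an expected 160.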

Anderson and Darling in one of their papers [2] mention that "empirical
study suggests that the asymptotic value is reached very rapidly, and it appears
safe to use the asymptotic value for a sample size as large as 40." The results
given above suggest the sample size for which it is reasonable to use the
asymptotic distribution is likely to be more nearly 3 or 4, or perhaps 5.

For an allied form of the $\omega^2$ test criterion, denoted by $W_n^2$ in [2] and formed by
adding to (1) the weight function $\psi(x) = [F(x)(1 - F(x))]^{-1}$, an even more
rapid convergence seems to occur. $F_1(z) = (1 - 4e^{-z-1})^{1/2}$ for the statistic $W_1^2$.
Evaluating $F_1(z)$ at the 90, 95, and 99 asymptotic percentage points given in [2]
yields .88716, .93292, and .98433 respectively.


REFERENCES

LIMITING DISTRIBUTIONS OF HOMOGENEOUS FUNCTIONS
OF SAMPLE SPACINGS¹

By LIONEL WEISS
Cornell University


1. Summary. Suppose $T_1, T_2, \cdots, T_n$ are the lengths of $n$ subintervals into
which the interval [0, 1] is broken by $(n - 1)$ independent chance variables,
each with a uniform distribution on [0, 1]. Moran [1], Kimball [2], and Darling
[3] have shown that if $r$ is a positive number, then the asymptotic distribution
of $T_1^r + T_2^r + \cdots + T_n^r$ is normal. It is the purpose of this note to extend this
result in two directions: more general functions of $T_1, \cdots, T_n$ are handled,
and the joint distribution of several such functions is discussed. The proof is
short and very simple.


2. Notation and assumptions. As already indicated, $T_1, T_2, \cdots, T_n$ are the
$n$ subintervals into which the unit interval is randomly broken. $U_1, U_2, \cdots, U_n$
are independent chance variables, each with the density function $e^{-u}$ for
$u \ge 0$, zero for $u < 0$. $S_n = U_1 + U_2 + \cdots + U_n$. $V_i = U_i/S_n$ for $i = 1, \cdots, n$.
It is known (and is very easily verified) that $S_n$ is distributed independently
of $(V_1, V_2, \cdots, V_n)$, and that the joint distribution of

       $(V_1, V_2, \cdots, V_n)$

is exactly the same as the joint distribution of $T_1, T_2, \cdots, T_n$.
We are given $k$ sequences of functions:

       $\{G_{1n}(U_1, U_2, \cdots, U_n)\}, \cdots, \{G_{kn}(U_1, U_2, \cdots, U_n)\},$

$n = 1, 2, \cdots$. These functions are assumed to satisfy the following conditions:
(1) $G_{in}(U_1, \cdots, U_n)$ is homogeneous of order $r_i$ for all $n$, $r_i$ a positive
quantity;
(2) the joint distribution of

       $\frac{G_{1n}(U_1, \cdots, U_n) - A_1 n}{B_1\sqrt{n}}, \ \cdots, \ \frac{G_{kn}(U_1, \cdots, U_n) - A_k n}{B_k\sqrt{n}}$

approaches a $k$-variate normal distribution with zero means and covariance
matrix $C$, say, as $n$ increases. $A_1, \cdots, A_k$ and $B_1, \cdots, B_k$ are positive
constants. (The results hold for any values of $A_1, \cdots, A_k$. The assumption that
they are positive is merely a convenience.)
We denote the element of $C$ in row $i$ and column $j$ by $c_{ij}$.


3. The asymptotic distribution of $G_{1n}(T_1, \cdots, T_n), \cdots, G_{kn}(T_1, \cdots, T_n)$.

THEOREM. Under the assumptions of Sec. 2, the joint distribution of

       $\frac{n^{r_1}G_{1n}(T_1, \cdots, T_n) - A_1 n}{B_1\sqrt{n}}, \ \cdots, \ \frac{n^{r_k}G_{kn}(T_1, \cdots, T_n) - A_k n}{B_k\sqrt{n}}$

approaches a k-variate normal distribution with zero means and covariance matrix

       $\left(c_{ij} - \frac{r_i r_j A_i A_j}{B_i B_j}\right)$

as n increases.

Received July 8, 1957.
¹ Research supported by the Office of Naval Research.


PROOF. By assumption, the distribution of the $k$-dimensional vector $V(n)$
whose $i$th element is

       $\frac{G_{in}(U_1, \cdots, U_n) - A_i n}{B_i\sqrt{n}}$

approaches the $k$-variate normal distribution with zero means and covariance
matrix $C$. We rewrite the $i$th term of $V(n)$ as

       $\frac{G_{in}(U_1, \cdots, U_n) - S_n^{r_i} n^{1-r_i} A_i + S_n^{r_i} n^{1-r_i} A_i - A_i n}{B_i\sqrt{n}}.$

Now $S_n/n$ converges stochastically to one as $n$ increases; therefore the
distribution of the $k$-dimensional vector $V'(n)$ whose $i$th element is

       $\frac{G_{in}(U_1, \cdots, U_n) - S_n^{r_i} n^{1-r_i} A_i + S_n^{r_i} n^{1-r_i} A_i - A_i n}{(S_n/n)^{r_i} B_i\sqrt{n}}$

approaches the $k$-variate normal distribution with zero means and covariance
matrix $C$. $V'(n)$ may be written as the sum of two vectors, $V_1(n)$ and $V_2(n)$,
whose $i$th elements are respectively

       $\frac{n^{r_i}G_{in}(V_1, \cdots, V_n) - A_i n}{B_i\sqrt{n}}$

and

       $\frac{A_i n - n^{r_i+1} A_i S_n^{-r_i}}{B_i\sqrt{n}}.$

We note that $V_1(n)$ and $V_2(n)$ are distributed independently of each other.
Next we examine the distribution function, say $F_n(x_1, \cdots, x_k)$, of $V_2(n)$.

       $F_n(x_1, \cdots, x_k) = \Pr\left[\frac{A_i n - n^{r_i+1} A_i S_n^{-r_i}}{B_i\sqrt{n}} \le x_i;\ i = 1, \cdots, k\right]$

       $\phantom{F_n(x_1, \cdots, x_k)} = \Pr\left[\frac{S_n - n}{\sqrt{n}} \le \sqrt{n}\left\{\left(\frac{A_i n}{A_i n - \sqrt{n}\,B_i x_i}\right)^{1/r_i} - 1\right\};\ i = 1, \cdots, k\right].$

As $n$ increases, the distribution of $(S_n - n)/\sqrt{n}$ approaches the standard
normal distribution, by the univariate central-limit theorem. And for any fixed $x_i$,

       $\sqrt{n}\left\{\left(\frac{A_i n}{A_i n - \sqrt{n}\,B_i x_i}\right)^{1/r_i} - 1\right\} \to \frac{B_i x_i}{r_i A_i}$

as $n$ increases. Thus, if $Z$ denotes a chance variable with a standard normal
distribution, $F_n(x_1, \cdots, x_k)$ approaches

       $\Pr\left[\frac{r_i A_i}{B_i} Z \le x_i;\ i = 1, \cdots, k\right]$

for each vector $(x_1, \cdots, x_k)$.

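Next, we denote by $\rho_{1,n}(t_1, \cdots, t_k)$ the characteristic function of $V_1(n)$, by
$\rho_{2,n}(t_1, \cdots, t_k)$ the characteristic function of $V_2(n)$, and by
$\rho_n(t_1, \cdots, t_k)$ the characteristic function of $V'(n)$.

We have $\rho_n(t_1, \cdots, t_k) = \rho_{1,n}(t_1, \cdots, t_k)\,\rho_{2,n}(t_1, \cdots, t_k)$, or

       $\rho_{1,n}(t_1, \cdots, t_k) = \frac{\rho_n(t_1, \cdots, t_k)}{\rho_{2,n}(t_1, \cdots, t_k)}.$

As $n$ increases,

       $\rho_n(t_1, \cdots, t_k) \to \exp\left\{-\frac{1}{2}\sum_{i,j=1}^{k} c_{ij} t_i t_j\right\}$

and

       $\rho_{2,n}(t_1, \cdots, t_k) \to \exp\left\{-\frac{1}{2}\left(\sum_{i=1}^{k} \frac{r_i A_i}{B_i} t_i\right)^2\right\}.$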


Therefore, as $n$ increases,

       $\rho_{1,n}(t_1, \cdots, t_k) \to \exp\left\{-\frac{1}{2}\sum_{i,j=1}^{k} t_i t_j\left(c_{ij} - \frac{r_i r_j A_i A_j}{B_i B_j}\right)\right\}.$

This proves the theorem.

ANOTHER COUNTABLE MARKOV PROCESS WITH ONLY
INSTANTANEOUS STATES

By DAVID BLACKWELL
University of California, Berkeley

Let $P$ be the transition function for a Markov process with a countable state
space $A$ and stationary transition probabilities; i.e., $P$ is a nonnegative
function defined for all triples $(a, b, t)$ with $a \in A$, $b \in A$, and $t$ a nonnegative
real number, satisfying

(1)    $P(a, b, 0) = 1$ if $a = b$, $0$ if $a \ne b$,

(2)    $\sum_{b} P(a, b, t) = 1$ for all $a, t$,

(3)    $P(a, b, s + t) = \sum_{c \in A} P(a, c, s)P(c, b, t)$ for all $s \ge 0$, $t \ge 0$, $a$, $b$.

We shall suppose, as usual, that $P$ is continuous at $t = 0$; i.e.,

(4)    $P(a, a, t) \to 1$ as $t \to 0$ for all $a$.

It is well known that, for any $P$ satisfying (1), (2), (3), and (4), $P'(a, a, 0)$
exists for all $a$ (it may be negatively infinite). Following P. Lévy [2], a state is
called "instantaneous" if $P'(a, a, 0) = -\infty$. Examples of processes with all
states instantaneous have been given by Feller and McKean [2] and by
Dobrushin [1]. The purpose of this note is to describe a third example, somewhat
simpler than those previously given.

We first describe the process informally, after which we define $P$ and verify
(1), (2), (3), and (4) and $P'(a, a, 0) = -\infty$ for all $a$ directly. Let $X_1(t), X_2(t),
\cdots$ be a sequence of Markov processes, independent of each other, each with
two states 0 and 1. We suppose $X_n(0) = 0$ for all $n$. Let $X_n(t)$ be characterized
by the parameters $\lambda_n, \mu_n$:

       $\Pr\{X_n(t + h) = 1 \mid X_n(t) = 0\} = \lambda_n h + o(h),$
       $\Pr\{X_n(t + h) = 0 \mid X_n(t) = 1\} = \mu_n h + o(h).$

Our process $X(t)$ will be the joint process $X_1(t), X_2(t), \cdots$ which is clearly a
Markov process. To insure that $X(t)$ has only a countable set of states, we
determine $\lambda_n, \mu_n$ so that, at each time $t$, with probability 1, $X_n(t) = 0$ for all
but a finite number of $n$. Since

       $\Pr\{X_n(t) = 0 \mid X_n(0) = 0\} = \frac{\mu_n}{\mu_n + \lambda_n} + \frac{\lambda_n}{\mu_n + \lambda_n}\,e^{-(\lambda_n + \mu_n)t} \ge \frac{\mu_n}{\mu_n + \lambda_n},$

this will occur if

(5)    $\sum_{n} \frac{\lambda_n}{\lambda_n + \mu_n} < \infty.$

A state is instantaneous if and only if the probability of remaining in
it throughout an interval is zero. Since the probability that $X_n(t) = 0$
throughout $T, T + h$ given that $X_n(T) = 0$ is $e^{-\lambda_n h}$, the chance that the state
$X(T)$ with $X_n(T) = 0$ for $n \ge N$ will persist throughout $T, T + h$ is at most

       $\prod_{n=N}^{\infty} e^{-\lambda_n h} = e^{-h(\lambda_N + \lambda_{N+1} + \cdots)},$

and will be zero if

(6)    $\sum_{n} \lambda_n = \infty.$

Thus any choice of $\{\lambda_n\}$, $\{\mu_n\}$ satisfying (5) and (6) yields an example of a
process with only instantaneous states.

Received July 11, 1957.
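The building block of the construction is the two-state transition function. The sketch below evaluates it, checks the Chapman-Kolmogorov relation (3) for a single component numerically, and tabulates partial sums of (5) and (6) for one admissible choice of rates. The choice $\lambda_n = 1$, $\mu_n = n^2$ is a hypothetical illustration, not one taken from the text, and the function names are likewise assumptions.

```python
import math


def two_state(lam, mu, i, j, t):
    """Transition probability from i to j in time t for the two-state chain
    that leaves 0 at rate lam and leaves 1 at rate mu."""
    total = lam + mu
    decay = math.exp(-total * t)
    stay = {0: mu / total + lam / total * decay,
            1: lam / total + mu / total * decay}
    return stay[i] if i == j else 1.0 - stay[i]


def chapman_kolmogorov_gap(lam, mu, s, t):
    """Largest violation of R(i, j, s + t) = sum_c R(i, c, s) R(c, j, t)."""
    gap = 0.0
    for i in (0, 1):
        for j in (0, 1):
            composed = sum(two_state(lam, mu, i, c, s) * two_state(lam, mu, c, j, t)
                           for c in (0, 1))
            gap = max(gap, abs(composed - two_state(lam, mu, i, j, s + t)))
    return gap


def partial_sums(n_terms):
    """Partial sums of (5) and (6) for the illustrative rates
    lambda_n = 1 and mu_n = n^2."""
    sum5 = sum(1.0 / (1.0 + n * n) for n in range(1, n_terms + 1))
    sum6 = float(n_terms)
    return sum5, sum6


print(chapman_kolmogorov_gap(1.0, 9.0, 0.2, 0.5))
for n_terms in (10, 1000, 100000):
    print(n_terms, partial_sums(n_terms))
```

With these rates the series in (5) stays bounded while the series in (6) grows without limit, which is the combination the construction requires.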
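Formally, the set $A$ of states is the set of all infinite sequences

       $a = (a_1, a_2, \cdots)$

of 0's and 1's with only finitely many 1's. Let $\{\lambda_n\}$, $\{\mu_n\}$ be sequences of
positive numbers satisfying (5) and (6), let

       $R_n(0, 0, t) = \frac{\mu_n}{\mu_n + \lambda_n} + \frac{\lambda_n}{\mu_n + \lambda_n}\,e^{-(\lambda_n + \mu_n)t},$

       $R_n(1, 1, t) = \frac{\lambda_n}{\mu_n + \lambda_n} + \frac{\mu_n}{\mu_n + \lambda_n}\,e^{-(\lambda_n + \mu_n)t},$

       $R_n(0, 1, t) = 1 - R_n(0, 0, t),$
       $R_n(1, 0, t) = 1 - R_n(1, 1, t),$

and define, for any two states $a = (a_1, a_2, \cdots)$ and $b = (b_1, b_2, \cdots)$ and any
$t \ge 0$,

(7)    $P(a, b, t) = \prod_{n=1}^{\infty} R_n(a_n, b_n, t).$

Denote by $A_N$ the set of all states $a = (a_1, a_2, \cdots)$ with $a_n = 0$ for all $n > N$.
For $a \in A_N$ and any $M \ge N$, we have

       $\sum_{b \in A_M} P(a, b, t) = h_M(t) \sum_{b \in A_M} \prod_{n=1}^{M} R_n(a_n, b_n, t)$

       $\phantom{\sum_{b \in A_M} P(a, b, t)} = h_M(t) \prod_{n=1}^{M} (R_n(a_n, 0, t) + R_n(a_n, 1, t)) = h_M(t),$

where

(8)    $h_M(t) = \prod_{n=M+1}^{\infty} R_n(0, 0, t) \ge \prod_{n=M+1}^{\infty} \frac{\mu_n}{\mu_n + \lambda_n} = V_M.$

From (8), $h_M(t) \to 1$ as $M \to \infty$, so that (2) is verified. For (3), say $a \in A_N$,
$b \in A_N$. For $M \ge N$,

       $\sum_{c \in A_M} P(a, c, s)P(c, b, t)$

       $= h_M(s)h_M(t) \sum_{c \in A_M} \prod_{n=1}^{M} R_n(a_n, c_n, s)R_n(c_n, b_n, t)$

       $= h_M(s)h_M(t) \prod_{n=1}^{M} \left(\sum_{x=0}^{1} R_n(a_n, x, s)R_n(x, b_n, t)\right)$

       $= h_M(s)h_M(t) \prod_{n=1}^{M} R_n(a_n, b_n, s + t) \to P(a, b, s + t)$ as $M \to \infty$.

For (4), if $a \in A_N$ and $M \ge N$,

       $P(a, a, t) \ge \left(\prod_{n=1}^{M} R_n(a_n, a_n, t)\right) V_M,$

       $\liminf_{t \to 0} P(a, a, t) \ge V_M.$

Since this holds for all $M$ and $V_M \to 1$ as $M \to \infty$, (4) is verified. Finally, since,
for $a \in A_N$ and $M \ge N$ we have, for all $k \ge 1$,

       $P(a, a, t) \le h_{M,k}(t) = \prod_{n=M+1}^{M+k} R_n(0, 0, t),$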


and since $P(a, a, 0) = h_{M,k}(0) = 1$,

       $P'(a, a, 0) \le h_{M,k}'(0) = -\sum_{n=M+1}^{M+k} \lambda_n,$

so that (6) implies $P'(a, a, 0) = -\infty$.

SPACINGS GENERATED BY MIXED SAMPLES

By LIONEL WEISS¹
Cornell University

1. Summary and introduction. Suppose $X(1, 1), X(1, 2), \cdots, X(1, n_1)$,
$X(2, 1), \cdots, X(2, n_2), \cdots, X(k, 1), \cdots, X(k, n_k)$ are independent chance
variables, $X(i, j)$ having the probability density function $f_i(x)$, for
$j = 1, \cdots, n_i$, $i = 1, \cdots, k$. We assume that for each $i$, $f_i(x)$ is bounded and
has at most a finite number of discontinuities. We denote $n_1 + n_2 + \cdots + n_k$ by
$N$, and we assume that $n_i/N$ is equal to $r_i$, where $r_i$ is a given positive number.
Let $Y_1 \le Y_2 \le \cdots \le Y_N$ denote the ordered values of the $N$ observations

       $X(1, 1), \cdots, X(k, n_k).$

Define $W_i$ as $Y_{i+1} - Y_i$ for $i = 1, \cdots, N - 1$. For any given nonnegative $t$,
let $R_N(t)$ denote the proportion of the values $W_1, \cdots, W_{N-1}$ which are greater
than $t/N$. Let $S(t)$ denote

       $\int_{-\infty}^{\infty} (r_1 f_1(x) + r_2 f_2(x) + \cdots + r_k f_k(x)) \exp\{-t[r_1 f_1(x) + \cdots + r_k f_k(x)]\}\,dx$

and $V(N)$ denote $\sup_{t \ge 0} |R_N(t) - S(t)|$. Then it is shown that $V(N)$
converges stochastically to zero as $N$ increases. This is a generalization of [1],
where $k$ was equal to unity. The result is applied to find the asymptotic
behavior of ranks in a $k$-sample problem.


2. Proof of the stochastic convergence of V(N). As in [1], if it can be shown
that $R_N(t)$ converges stochastically to $S(t)$ for each positive $t$, the convergence
of $V(N)$ follows. Therefore we fix a positive value for $t$.

We define the chance variable $Z(i, j, N)$ to be equal to unity if no
observations fall in the half-open interval $(X(i, j), X(i, j) + t/N]$, and equal to zero
otherwise. We denote $(1/N) \sum_{i=1}^{k} \sum_{j=1}^{n_i} Z(i, j, N)$ by $K(N)$. Clearly,

       $K(N) = (1 - 1/N)R_N(t) + 1/N,$

so our purpose is accomplished if we show that $K(N)$ converges stochastically
to $S(t)$ as $N$ increases.

Received July 17, 1957; revised August 26, 1957.

We denote $\int_{-\infty}^{x} f_i(x)\,dx$ by $F_i(x)$.

       $E\{Z(i, j, N)\} = \int_{-\infty}^{\infty} \left[1 - F_i\left(x + \frac{t}{N}\right) + F_i(x)\right]^{n_i - 1}
           \prod_{h \ne i} \left[1 - F_h\left(x + \frac{t}{N}\right) + F_h(x)\right]^{n_h} dF_i(x).$

But with the exception of a finite number of points, $F_h(x + t/N) - F_h(x)$ can
be written as $[f_h(x) + \epsilon_h(x, t/N)]\,t/N$, where $\epsilon_h(x, t/N)$ approaches zero as $N$
increases, for each $x$. Since $f_i(x)$ is bounded $(i = 1, \cdots, k)$, it follows easily
that $E\{Z(i, j, N)\}$ approaches

       $\int_{-\infty}^{\infty} \exp\{-t[r_1 f_1(x) + \cdots + r_k f_k(x)]\}\,dF_i(x)$

as $N$ increases. It follows immediately that $E\{K(N)\}$ approaches $S(t)$ as $N$
increases.
Next we examine variance $\{K(N)\}$, which equals

$$N^{-2} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \text{variance}\,\{Z(i, j, N)\} + \frac{1}{N^2} \sum\sum_{(i,j) \ne (g,h)} \text{cov}\,\{Z(i, j, N), Z(g, h, N)\}.$$

The first term in this last expression clearly approaches zero as $N$ increases,
since there are $N$ uniformly bounded terms in the sum. We shall show that the
second term also approaches zero by showing that the covariances approach zero
uniformly. Since there are $N(N - 1)$ covariances, the factor $1/N^2$ guarantees
the approach to zero.
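If $i \ne g$, $E\{Z(i, j, N) \cdot Z(g, h, N)\}$ is equal to

$$\iint_{|x - y| > t/N} \Bigl[1 - F_i\Bigl(x + \frac{t}{N}\Bigr) + F_i(x) - F_i\Bigl(y + \frac{t}{N}\Bigr) + F_i(y)\Bigr]^{n_i - 1}$$
$$\cdot \Bigl[1 - F_g\Bigl(x + \frac{t}{N}\Bigr) + F_g(x) - F_g\Bigl(y + \frac{t}{N}\Bigr) + F_g(y)\Bigr]^{n_g - 1}$$
$$\cdot \prod_{l \ne i, g} \Bigl[1 - F_l\Bigl(x + \frac{t}{N}\Bigr) + F_l(x) - F_l\Bigl(y + \frac{t}{N}\Bigr) + F_l(y)\Bigr]^{n_l} dF_i(x)\, dF_g(y).$$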
By computations similar to those used on $E\{Z(i, j, N)\}$, it follows that
$E\{Z(i, j, N) \cdot Z(g, h, N)\}$ approaches

$$\iint \exp\{-t[r_1 f_1(x) + \cdots + r_k f_k(x)]\} \cdot \exp\{-t[r_1 f_1(y) + \cdots + r_k f_k(y)]\}\, dF_i(x)\, dF_g(y)$$

and from this it follows that cov $\{Z(i, j, N), Z(g, h, N)\}$ approaches zero as $N$
increases. In the same way, it follows that cov $\{Z(i, j, N), Z(i, h, N)\}$
approaches zero ($j \ne h$). Thus variance $\{K(N)\}$ approaches zero as $N$ increases,
so $K(N)$ converges stochastically to $S(t)$, as does $R_N(t)$. Therefore we have
shown that $V(N)$ converges stochastically to zero as $N$ increases.


3. Application to ranks in k-samples. Define $T(i, j)$ as $F_1(X(i, j))$. Then
$T(1, 1), \ldots, T(1, n_1)$ have uniform distributions. Let $G_i(x)$ denote the resulting
distribution function for $T(i, j)$. We assume that $G_i(x)$ allows a density
function $g_i(x)$ (then $g_i(x)$ is zero outside the interval $[0, 1]$, is bounded, and
has a finite number of discontinuities). Let $V_1 \le V_2 \le \cdots \le V_{N - n_1}$ denote
the ordered values of $T(2, 1), \ldots, T(k, n_k)$, and let $V_0$ equal zero, $V_{N - n_1 + 1}$
equal one. Let $S_i$ denote the number of $T(1, j)$'s which lie in the interval
$[V_{i-1}, V_i]$, for $i = 1, \ldots, N - n_1 + 1$.

For each nonnegative integer $r$, let $Q_N(r)$ be the proportion of values among
$S_1, \ldots, S_{N - n_1 + 1}$ which are equal to $r$. Define $g(y)$ as $\sum_{i=2}^{k} (r_i/(1 - r_1))\, g_i(y)$, and
$a$ as $r_1/(1 - r_1)$. Define $Q(r)$ as

$$a^r \int_0^1 \frac{g^2(y)}{[a + g(y)]^{r+1}}\, dy.$$

Then it follows from the results above, using also the argument in [2], that
$\sup_{r \ge 0} |Q_N(r) - Q(r)|$ converges stochastically to zero as $N$ increases. This
can be used to show that certain tests of the hypothesis

$$F_1(x) = F_2(x) = \cdots = F_k(x)$$

are consistent. The discussion parallels that found in [2].
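To make the limit $Q(r)$ concrete, the sketch below evaluates the integral by a midpoint rule. It is an illustration under assumptions that are not in the paper: the function name `q_limit`, the grid size, and the two example densities are hypothetical choices. When all the $F_i$ coincide, $g(y) = 1$ on $[0, 1]$ and the integral reduces to the geometric form $a^r/(1 + a)^{r+1}$.

```python
def q_limit(r, a, g, steps=2_000):
    """Midpoint-rule value of Q(r) = a^r * int_0^1 g(y)^2 / (a + g(y))^(r+1) dy."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        gy = g(y)
        total += gy * gy / (a + gy) ** (r + 1)
    return a ** r * total * h


a = 0.5                      # corresponds to r_1 = 1/3, since a = r_1 / (1 - r_1)


def null_density(y):
    """g under the hypothesis F_1 = ... = F_k: uniform on [0, 1]."""
    return 1.0


def tilted_density(y):
    """An arbitrary alternative density on [0, 1], used only as an example."""
    return 2.0 * y


null_values = [q_limit(r, a, null_density) for r in range(5)]
geometric = [a ** r / (1.0 + a) ** (r + 1) for r in range(5)]
total_mass = sum(q_limit(r, a, tilted_density) for r in range(400))
print(null_values)
print(geometric)
print(total_mass)
```

For any density $g$ the values $Q(r)$ sum to one over $r = 0, 1, \ldots$, which the last printed number approximates with a truncated sum.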
REFERENCES
[1] L. Weiss, "The stochastic convergence of a function of sample successive differences,"
Ann. Math. Stat., Vol. 26 (1955), pp. 532-536.
[2] J. R. Blum and L. Weiss, "Consistency of certain two-sample tests," Ann. Math. Stat.,
CORRECTION TO "AN EXTENSION OF THE KOLMOGOROV
DISTRIBUTION"


By JEROME BLACKMAN
Syracuse and Cornell Universities
1. Summary. It has been pointed out by J. H. B. Kemperman that an error 
in [1] invalidates the formulas arrived at in that paper. It is the purpose of this 
note to supply the correct formulas for the probabilities of Theorems 1 and 2. 
An Appendix by Professor Kemperman is included. 


Received February 18, 1957; revised August 14, 1957. 
2. Introduction. The major error in [1] lies in the mappings on pp. 516-517.
Corrections can be made for this but unfortunately the resulting formulas are
more complicated than before. A smaller error appears in the statement that
$N(A_{2i}) = N(B_{2i})$, but this is easily corrected. The new formulas are so much
more complicated that it has not seemed worthwhile to correct Corollaries 1
and 2 which are hereby retracted. The corrected statements of the main results
follow.

THEOREM 1. Let $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_{nk}$ be a sequence of $n(k + 1)$
independent random variables with a common continuous distribution $F(x)$. Let
$F_n(x)$ and $G_{nk}(x)$ be empiric distributions based on the first $n$ and second $kn$
random variables respectively. Then

$$P(-y < G_{nk}(s) - F_n(s) < x \text{ for all } s) = 1 - \binom{(k+1)n}{n}^{-1} \sum_{i=1}^{\infty} \{N(A_{2i-1}) + N(B_{2i-1}) - N(A_{2i}) - N(B_{2i})\},$$

where the $N$ functions are given in (1), (2), and (3).
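For small $n$ and $k$ the probability in Theorem 1 can be computed directly, since every ordering of the $n$ $x$'s among the $nk$ $y$'s is equally likely and the difference of the empiric distributions only changes at observations. The following sketch counts admissible orderings by dynamic programming. It is not the method of this note, only a brute-force check one might compare the formulas against; the function name `two_sided_probability` is hypothetical.

```python
from fractions import Fraction
from math import comb


def two_sided_probability(n, k, x, y):
    """P(-y < G_nk(s) - F_n(s) < x for all s), by counting orderings.

    A state (a, b) means a of the x-observations and b of the y-observations
    lie at or below s; there G_nk(s) - F_n(s) = b/(nk) - a/n.
    """
    total_y = n * k
    x, y = Fraction(x), Fraction(y)
    ways = [[0] * (total_y + 1) for _ in range(n + 1)]
    for a in range(n + 1):
        for b in range(total_y + 1):
            diff = Fraction(b, total_y) - Fraction(a, n)
            if not (-y < diff < x):
                continue                      # bound violated: no path may pass here
            if a == 0 and b == 0:
                ways[a][b] = 1
            else:
                from_x = ways[a - 1][b] if a > 0 else 0
                from_y = ways[a][b - 1] if b > 0 else 0
                ways[a][b] = from_x + from_y
    return Fraction(ways[n][total_y], comb(n + total_y, n))


print(two_sided_probability(2, 1, 1, 1))
print(two_sided_probability(3, 2, Fraction(1, 2), Fraction(1, 2)))
```

Exact rational arithmetic is used so that the strict inequalities are not affected by rounding.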
THEOREM 2.

$$P(-y < F(s) - F_n(s) < x \text{ for all } s) = 1 - \sum_{i=1}^{\infty} \{\bar{N}(A_{2i-1}) + \bar{N}(B_{2i-1}) - \bar{N}(A_{2i}) - \bar{N}(B_{2i})\},$$

where the $\bar{N}$ functions are given in (5), (6), and (7).


3. Corrections. The point of departure from [1] is the middle of p. 516 where 
a formula for N(@») is given. It is readily seen that upon dividing this equation 


by the total number of paths, $\binom{(k+1)n}{n}$, one obtains Theorem 1 except for
the analytical expressions for the $N$ functions. We will also use the mapping
of the $A_i$ and $B_i$ classes described at the bottom of p. 516, although the
conclusions drawn there about the mapping are incorrect. The error is clear if we
consider the image of a path from $A_3$ under the mapping. The image will be a
path which starts from the origin, reaches $2\alpha + \beta$, and then on the return to 0
stops at least once at the point $\alpha$. The class $A_3$ will be in 1:1 correspondence
with the set of paths which starting at 0 reach $2\alpha + \beta$ and then on the return
stop at least once at $\alpha$. Because the steps to the left are of length $k$, not every
path which reaches $2\alpha + \beta$ will, at some later step, stop at $\alpha$. In Table 1 the
images of the $A_i$ and $B_i$ under the mapping are given. The second column gives
the points which the path must reach and the last column gives the points at
which the path must stop, in order, after reaching the point described in column
2. In all cases the mapping is 1:1 between the class in the first column and the
set of paths which reach the point indicated in the second column and
subsequently stop in order at all the points indicated in the last column.
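TABLE 1

Class         Point reached               Points at which the path must then stop, in order
$A_{2i-1}$    $i(\alpha + \beta) - \beta$     $(i-1)(\alpha + \beta) - \beta,\ (i-2)(\alpha + \beta) - \beta,\ \ldots,\ \alpha$
$A_{2i}$      $i(\alpha + \beta)$             $(i-1)(\alpha + \beta),\ (i-2)(\alpha + \beta),\ \ldots,\ (\alpha + \beta)$
$B_{2i-1}$    $i(\alpha + \beta) - \alpha$    $(i-1)(\alpha + \beta) - \alpha,\ \ldots,\ \beta$
$B_{2i}$      $i(\alpha + \beta)$             $i(\alpha + \beta) - \beta,\ (i-1)(\alpha + \beta) - \beta,\ \ldots,\ \alpha$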


As a preliminary step consider the number of ways a path consisting of $i$
steps to the left and $ki - \alpha$ steps to the right can go from $\alpha$ to 0 without
touching $\alpha$ after the first step. Let the number of these paths be $H_\alpha(i)$. While this
number can be computed by elementary methods a more elegant formula has
been obtained by Professor Kemperman, namely,

$$(1)\qquad H_\alpha(i) = (k + 1) \sum_{0 \le r < \alpha/(k+1)} \frac{(-1)^r}{(i - r)(k + 1) - 1} \binom{\alpha - kr - 1}{r} \binom{(i - r)(k + 1) - 1}{i - r} - \frac{\alpha}{(k + 1)i - \alpha} \binom{(k + 1)i - \alpha}{i}.$$

The proof of this is contained in the appendix.
The number of ways of going from 0 to $\alpha$ after exactly $j$ steps to the left and
$kj + \alpha$ steps to the right will be indicated by $J(\alpha, j)$ where

$$(2)\qquad J(\alpha, j) = \binom{(k + 1)j + \alpha}{j}.$$


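Combining the results of Table 1 and the definitions of $H$ and $J$ we see that

$$(3)\qquad \begin{aligned}
N(A_{2i-1}) &= \sum_{j_1 + \cdots + j_{i+1} = n} J(i(\alpha + \beta) - \beta, j_1) \prod_{k=2}^{i} H_{\alpha+\beta}(j_k)\, H_\alpha(j_{i+1}), \\
N(A_{2i}) &= \sum_{j_1 + \cdots + j_{i+1} = n} J(i(\alpha + \beta), j_1) \prod_{k=2}^{i+1} H_{\alpha+\beta}(j_k), \\
N(B_{2i-1}) &= \sum_{j_1 + \cdots + j_{i+1} = n} J(i(\alpha + \beta) - \alpha, j_1) \prod_{k=2}^{i} H_{\alpha+\beta}(j_k)\, H_\beta(j_{i+1}), \\
N(B_{2i}) &= \sum_{j_1 + \cdots + j_{i+2} = n} J(i(\alpha + \beta), j_1)\, H_\beta(j_2) \prod_{k=3}^{i+1} H_{\alpha+\beta}(j_k)\, H_\alpha(j_{i+2}).
\end{aligned}$$

This completes Theorem 1. The infinite series occurring in this theorem is
really a finite series in view of $N(A_{2i-1}) = \cdots = N(B_{2i}) = 0$ for
$i > nk/(\alpha + \beta)$.

To get Theorem 2 it is only necessary to take the limit as $k \to \infty$ in the various
formulas given above. By Stirling's formula,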
. ( E ; 
(4) ¥ — “ OU +k) as ko, 
Here and below we will use $a_k = O(b_k)$ to mean $\lim_{k \to \infty} a_k/b_k = 1$.
Using (4) and a few more applications of Stirling's formula and remembering
that $\alpha = -[-xkn]$ and $\beta = -[-ykn]$, we obtain
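$$(5)\qquad H_\alpha(i) = \Bigl\{ \sum_{0 \le r < xn} (-1)^r \frac{(xn - r)^r (i - r)^{i-r-1}}{(i - r)!\, r!} - \frac{xn (i - xn)^{i-1}}{i!} \Bigr\} \cdot O((1 + k)^i) = \bar{H}_x(i)\, O((1 + k)^i)$$

where the last equality defines $\bar{H}_x(i)$. Using (4) again

$$(6)\qquad J(\alpha, j) = \frac{1}{j!} (j + xn)^j\, O((1 + k)^j) = \bar{J}(x, j)\, O((1 + k)^j)$$

and

$$\binom{(k + 1)n}{n} = \frac{n^n}{n!}\, O((1 + k)^n).$$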

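Combining these results and (3) the following equations are obtained:

$$\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} N(A_{2i-1}) = \frac{n!}{n^n} \sum_{j_1 + \cdots + j_{i+1} = n} \bar{J}(i(x + y) - y, j_1) \prod_{k=2}^{i} \bar{H}_{x+y}(j_k)\, \bar{H}_x(j_{i+1}) = \bar{N}(A_{2i-1}),$$

$$\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} N(A_{2i}) = \frac{n!}{n^n} \sum_{j_1 + \cdots + j_{i+1} = n} \bar{J}(i(x + y), j_1) \prod_{k=2}^{i+1} \bar{H}_{x+y}(j_k) = \bar{N}(A_{2i}),$$

$$\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} N(B_{2i-1}) = \frac{n!}{n^n} \sum_{j_1 + \cdots + j_{i+1} = n} \bar{J}(i(x + y) - x, j_1) \prod_{k=2}^{i} \bar{H}_{x+y}(j_k)\, \bar{H}_y(j_{i+1}) = \bar{N}(B_{2i-1}),$$

$$\lim_{k \to \infty} \binom{(k+1)n}{n}^{-1} N(B_{2i}) = \frac{n!}{n^n} \sum_{j_1 + \cdots + j_{i+2} = n} \bar{J}(i(x + y), j_1)\, \bar{H}_y(j_2) \prod_{k=3}^{i+1} \bar{H}_{x+y}(j_k)\, \bar{H}_x(j_{i+2}) = \bar{N}(B_{2i}).$$

This completes Theorem 2.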
Attention should be drawn to a paper of Korolyuk [2] wherein the author
gives different versions of the probabilities we have presented for the case
$x = y$.
APPENDIX


By J. H. B. KEMPERMAN


By a path of length $n$ we shall mean an ordered sequence of $n + 1$ integers
$(z_0, \ldots, z_n)$, such that

$$z_i - z_{i-1} \ge -1 \qquad (i = 1, \ldots, n).$$

For each path $\pi_n = (z_0, \ldots, z_n)$, let

$$P(\pi_n) = \prod_{i=1}^{n} p(z_i - z_{i-1}),$$

(the weight or "probability" of $\pi_n$). Here, the $p_i = p(i)$, ($i = -1, 0, +1, \ldots$)
denote given (real or complex) numbers, $p(-1) \ne 0$. Finally, let

$$e_z(n) = \sum_{\pi_n} P(\pi_n),$$

the summation being extended over all the paths $\pi_n = (z_0, \ldots, z_n)$ with $z_0 = 0$,
$z_n = z$, $z_i \ne z$ ($i = 0, 1, \ldots, n - 1$).
THEOREM. For $n = 1, 2, \ldots$,

$$(8)\qquad e_z(n) = -z\, r_z(n)/n + \sum_{j=1}^{\infty} j(j + 1) p_j \sum_{0 < m \le z} r_z(-m)\, r_{-j}(m + n - 1)/(m + n - 1).$$

Here, for arbitrary integers $h$ and $s$, $r_h(s)$ is defined as the coefficient of $w^{h+s}$ in the
formal development

$$(p_{-1} + p_0 w + p_1 w^2 + \cdots)^s = \sum_h r_h(s)\, w^{h+s};$$

especially, $r_h(s) = 0$ if $h + s < 0$.


PROOF. Let $n$ and $z$ be given integers, $n \ge 1$. For any path $(z_0, \ldots, z_n)$ with
$z_0 = 0$, $z_n = z$, we have

$$z_i - z_{i-1} = z - \sum_{j \ne i} (z_j - z_{j-1}) \le z + n - 1,$$

($i = 1, \ldots, n - 1$), thus, $e_z(n)$ does not depend on the $p_i$ with $i \ge n + z$.
Further, $r_h(s)$ does not depend on the $p_i$ with $i \ge h + s$, hence, the inner sum
in (8) does not depend on the $p_i$ with $i \ge n + z$; moreover, the $j$th inner sum
equals 0 when $j \ge n + z$. Consequently, it suffices to prove the theorem for the
special case that $p_i = 0$ for $i$ sufficiently large.

In this case,

$$f(w) = \sum_{i=-1}^{\infty} p_i w^i$$

is analytic at each point $w \ne 0$. Further, for $|w|$ sufficiently small

$$(9)\qquad f(w)^s = \sum_{h=-s}^{\infty} r_h(s)\, w^h,$$

hence, for $s \ge 0$

$$r_h(s) = \sum P(\pi_s),$$

summing over all the paths $\pi_s = (z_0, \ldots, z_s)$ with $z_0 = 0$, $z_s = h$. Observing
that to each path $(z_0, \ldots, z_n)$ with $z_n = z$ there corresponds a unique integer
$m$ with $0 \le m \le n$, $z_i \ne z$ ($i = 0, 1, \ldots, m - 1$), $z_m = z$, it follows that

$$r_z(n) = \sum_{m=0}^{n} e_z(m)\, r_0(n - m) \qquad (n = 0, 1, \ldots),$$

hence,

$$(10)\qquad E_z = R_z / R_0,$$

where

$$(11)\qquad R_h = \sum_{n=0}^{\infty} r_h(n)\, t^n, \qquad E_z = \sum_{n=0}^{\infty} e_z(n)\, t^n,$$

$t$ denoting a sufficiently small parameter, $t \ne 0$.

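Further, from (9), for each integer $h$,

$$R_h + \sum_{-h \le n < 0} r_h(n)\, t^n = \sum_{n=-h}^{\infty} r_h(n)\, t^n = \sum_{n=-h}^{\infty} \frac{t^n}{2\pi\sqrt{-1}} \oint_{|w|=R} f(w)^n \frac{dw}{w^{h+1}} = \frac{1}{2\pi\sqrt{-1}} \oint_{|w|=R} \frac{(t f(w))^{-h}}{1 - t f(w)} \frac{dw}{w^{h+1}},$$

where $R$ denotes a fixed positive number with $f(w) \ne 0$ for $0 < |w| \le R$.
Here, from $p(-1) \ne 0$, the integrand is regular at $w = 0$. Moreover, for $t \ne 0$,
$|t|$ sufficiently small, the equation $f(\xi) = t^{-1}$ has a unique solution satisfying
$0 < |\xi| < R$. Thus,

$$R_h = (-t \xi f'(\xi))^{-1} \xi^{-h} - \sum_{0 < m \le h} r_h(-m)\, t^{-m}.$$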
Finally, (10) and

$$\xi f'(\xi) = \xi f'(\xi) + f(\xi) - t^{-1} = -t^{-1} + \sum_{j=0}^{\infty} (j + 1) p_j \xi^j$$

imply

$$E_z = \xi^{-z} + \Bigl(-1 + t \sum_{j=0}^{\infty} (j + 1) p_j \xi^j\Bigr) \sum_{0 < m \le z} r_z(-m)\, t^{-m}.$$

In view of (11), it suffices to prove that, for each integer $h$ and $|t|$ sufficiently
small, $t \ne 0$,

$$\xi^{-h} = -h \sum_{\substack{m = -h \\ m \ne 0}}^{\infty} r_h(m)\, t^m / m + c_h,$$

where $c_h$ denotes a constant. Now, for $|t|$, $|\xi|$ small, the mapping $t \to \xi$
defined by $f(\xi) = t^{-1}$ is a 1:1 analytic transformation. Hence, integrating along a
small positively oriented circle about 0, we have, for $m \ne 0$,

$$\oint \xi^{-h}\, t^{-m-1}\, dt = -\oint \xi^{-h}\, d(f(\xi)^m / m) = -\frac{h}{m} \oint f(\xi)^m\, \xi^{-h-1}\, d\xi = -2\pi\sqrt{-1}\, \frac{h}{m}\, r_h(m).$$

REMARK. Results and methods analogous to the above may be found in the
paper "The passage problem for a stationary Markov chain" by J. H. B. Kemperman,
to appear in these Annals.


Let $k$ be a fixed positive integer and choose $p(-1) = p(k) = 1$, $p(i) = 0$
for $i \ne -1, k$. Then $e_z(n)$ is equal to the number of sequences $(z_0, \ldots, z_n)$
with $z_i - z_{i-1} = -1$ or $+k$ ($i = 1, \ldots, n$), $z_0 = 0$, $z_n = z$,
$z_i \ne z$ ($i = 0, 1, \ldots, n - 1$).
Further, $H_\alpha(i)$ is equal to the number of sequences $(z_n, z_{n-1}, \ldots, z_0)$ with
$n = -\alpha + i(k + 1) \ge 1$, $z_i - z_{i-1} = -1$ or $k$ ($i = 1, \ldots, n$),
$z_n = \alpha$, $z_0 = 0$, $z_i \ne \alpha$ ($i = 0, \ldots, n - 1$). Hence,

$$H_\alpha(i) = e_\alpha(-\alpha + i(k + 1))$$

and the above Theorem yields

$$H_\alpha(i) = -\alpha\, r_\alpha(n)/n + k(k + 1) \sum_{0 < m \le \alpha} r_\alpha(-m)\, r_{-k}(m + n - 1)/(m + n - 1),$$

where $n = -\alpha + i(k + 1)$. Noting that $r_h(s)$ is equal to the coefficient of $w^{h+s}$
in the expansion of $(1 + w^{k+1})^s$ about 0, formula (1) easily follows.
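The specialized formula for $H_\alpha(i)$ can be checked against a direct count for small cases. The sketch below is an illustration only: the function names (`gen_binom`, `r_coeff`, `h_formula`, `h_brute`) are hypothetical, and the generalized binomial coefficient is used for $r_h(s)$ when $s$ is negative, which is how the expansion of $(1 + w^{k+1})^s$ about 0 is read here.

```python
from fractions import Fraction
from functools import lru_cache


def gen_binom(s, q):
    """Generalized binomial coefficient C(s, q) for any integer s and q >= 0."""
    result = Fraction(1)
    for t in range(q):
        result = result * (s - t) / (t + 1)
    return result


def r_coeff(h, s, k):
    """r_h(s): coefficient of w^(h+s) in the expansion of (1 + w^(k+1))^s."""
    power = h + s
    if power < 0 or power % (k + 1) != 0:
        return Fraction(0)
    return gen_binom(s, power // (k + 1))


def h_formula(alpha, i, k):
    """H_alpha(i) from the Theorem with p(-1) = p(k) = 1."""
    n = -alpha + i * (k + 1)
    total = Fraction(-alpha) * r_coeff(alpha, n, k) / n
    for m in range(1, alpha + 1):
        total += (k * (k + 1) * r_coeff(alpha, -m, k)
                  * r_coeff(-k, m + n - 1, k) / (m + n - 1))
    return total


def h_brute(alpha, i, k):
    """Count paths from alpha to 0 using i left steps of length k and
    k*i - alpha right steps of length 1 that never return to alpha."""
    @lru_cache(maxsize=None)
    def count(pos, lefts, rights):
        if lefts == 0 and rights == 0:
            return 1 if pos == 0 else 0
        ways = 0
        if lefts and pos - k != alpha:
            ways += count(pos - k, lefts - 1, rights)
        if rights and pos + 1 != alpha:
            ways += count(pos + 1, lefts, rights - 1)
        return ways
    return count(alpha, i, k * i - alpha)


for alpha, i, k in [(1, 2, 1), (2, 2, 1), (1, 1, 2), (2, 2, 2)]:
    print(alpha, i, k, h_formula(alpha, i, k), h_brute(alpha, i, k))
```

Exact fractions are used because the individual terms of the formula are not integers in general, although their sum is.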
(Abstracts of papers presented at the Los Angeles Meeting of the Institute, December 27-28, 1957)


1. Non-parametric Multiple-Decision Procedures for Selecting That one of K
Populations Which has the Highest Probability of Yielding the Largest
Observation. (Preliminary Report) ROBERT BECHHOFER, Cornell University
AND MILTON SOBEL, Bell Telephone Laboratories. (By title)


Let $X_i$ be chance variables with density function $f_i(x)$, and let
$p_i = \text{Prob}\{X_i > \max_{j \ne i} X_j\}$ ($i = 1, \ldots, k$).
Then $\sum_{i=1}^{k} p_i = 1$. Let $p_{[1]} \le \cdots \le p_{[k]}$ denote the ranked $p_i$. Let $\theta^*$, $P^*$ ($1 < \theta^* < \infty$,
$1/k < P^* < 1$) be specified constants. The goal is to select the population associated with
$p_{[k]}$; the procedure must guarantee (*) $\text{Prob}\{\text{Correct Selection} \mid p_{[k]} \ge \theta^* p_{[k-1]}\} \ge P^*$.
Procedure: "At the $m$th stage take the vector-observation $x_m = (x_{1m}, \ldots, x_{km})$ where the
$x_{ij}$ ($j = 1, 2, \ldots$) are independent observations from the $i$th population. Consider $y_m =
(y_{1m}, \ldots, y_{km})$ which is obtained by replacing the largest component of $x_m$ by unity, and all
other components by zero. Then $y_m$ is an observation from a multinomial distribution with
probability $p_i$ associated with the $i$th component ($i = 1, 2, \ldots, k$). (*) now can be guaranteed
by continuing with procedures already proposed, e.g., these Annals, Vol. 27, p. 861."
If $f_i(x) = g\{(x - \mu_i)/\theta\}$ ($i = 1, 2, \ldots, k$), then the procedure can be used for selecting
the population associated with the largest $\mu_i$ for any $\theta$, known or unknown. Similar
non-parametric procedures in which pairs of observations are taken from each population at
each stage of experimentation, and which employ the range of each pair can be used for
selecting that one of $k$ populations which has the highest probability of yielding the largest
sample range. If $f_i(x) = h\{(x - \mu_i)/\theta_i\}$ ($i = 1, 2, \ldots, k$), then these latter procedures can
be used for selecting the population associated with the largest $\theta_i$ for any set of $\mu_i$, known
or unknown. (Research supported in part by the U. S. Air Force through the Air Force
Office of Scientific Research, ARDC, Contract No. AF 18(600)-331.) (Received September
25, 1957.)
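The reduction in the abstract above, from a vector-observation $x_m$ to a multinomial indicator $y_m$, can be sketched in a few lines. This is an illustration under assumptions that are not in the abstract: three normal populations with hypothetical means 0, 0.5 and 1 and common unit scale, a fixed number of stages, and the hypothetical helper name `indicator_of_largest`. It shows only the transformation and the resulting frequency estimates of the $p_i$, not the sequential stopping rule of the cited procedure.

```python
import random


def indicator_of_largest(x):
    """Replace the largest component of x by 1 and all others by 0."""
    top = max(range(len(x)), key=lambda i: x[i])
    return [1 if i == top else 0 for i in range(len(x))]


rng = random.Random(2)
means = [0.0, 0.5, 1.0]      # hypothetical location parameters mu_i
stages = 20_000

counts = [0] * len(means)
for _ in range(stages):
    x_m = [rng.gauss(mu, 1.0) for mu in means]      # one observation per population
    y_m = indicator_of_largest(x_m)                 # a single multinomial trial
    counts = [c + y for c, y in zip(counts, y_m)]

p_hat = [c / stages for c in counts]                # estimates of the p_i
print(p_hat)
```

Because the populations here differ only in location, the population with the largest estimated $p_i$ is also the one with the largest mean, in line with the remark about $\mu_i$ in the abstract.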


2. The Asymptotic Efficiency of Friedman's $\chi_r^2$-test. PH. VAN ELTEREN,
Mathematical Centre, Amsterdam. (By title)


Let $F(x)$ be a continuous cdf with density function $f(x) = F'(x)$ and let
$x_{\mu\nu}$ ($\mu = 1, 2, \ldots, m$; $\nu = 1, 2, \ldots, n$)
be a chance variable with distribution $F_{\mu\nu}(x) = F(x + \theta_\nu + \eta_\mu)$. It is assumed, for
convenience, that $\sum_\nu \theta_\nu = 0$. Friedman (1937) has constructed the $\chi_r^2$-test for the hypothesis
$\theta_1 = \theta_2 = \cdots = \theta_n = 0$ (J. Amer. Stat. Assn., Vol. 32, pp. 675-699). For alternatives
$\theta_\nu = \theta_{\nu m} = \delta_\nu/\sqrt{m}$, where the $\delta_\nu$ are given constants satisfying $\sum_\nu \delta_\nu = 0$, the asymptotic
relative efficiency for $m \to \infty$ in the sense of Pitman of Friedman's test with respect to the
corresponding 2-way analysis of variance test is found to be $e_n = 12n(n + 1)^{-1} \sigma^2 \{\int f^2(x)\, dx\}^2$,
where $\sigma^2$ is the variance associated with $F(x)$. If $f(x)$ is normal, $e_n$ reduces to $e_n =
3n/\pi(n + 1)$. (Received August 19, 1957.)
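The reduction to $3n/\pi(n + 1)$ in the normal case follows because $\int f^2(x)\,dx = 1/(2\sigma\sqrt{\pi})$ for a normal density, so that $\sigma^2 \{\int f^2\}^2 = 1/(4\pi)$. A short numerical check, with the hypothetical helper names `friedman_efficiency` and `integrate_f_squared` and a simple midpoint rule chosen only for illustration:

```python
import math


def friedman_efficiency(n, sigma2, integral_f2):
    """e_n = 12 n (n + 1)^(-1) sigma^2 (int f^2 dx)^2."""
    return 12.0 * n / (n + 1) * sigma2 * integral_f2 ** 2


def normal_density(x):
    """Standard normal density (variance 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate_f_squared(f, lo=-10.0, hi=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of f(x)^2 over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) ** 2 for i in range(steps)) * h


integral = integrate_f_squared(normal_density)
for n in (2, 5, 50):
    numeric = friedman_efficiency(n, 1.0, integral)
    closed_form = 3.0 * n / (math.pi * (n + 1))
    print(n, round(numeric, 6), round(closed_form, 6))
```

As $n$ grows the efficiency increases toward $3/\pi$.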


3. Experiments With Mixtures. HENRY SCHEFFÉ, University of California.


Experiments with mixtures of $q$ components are considered, whose purpose is the
empirical prediction of the response to any mixture of the components, when the response
depends on the proportions $x_1, x_2, \ldots, x_q$ of the components present but not on the total
amount. The factor space is then the $(q - 1)$-dimensional simplex where $x_1 + \cdots + x_q = 1$,
$x_i \ge 0$. An experimental design called the simplex lattice and some modifications are treated;
in the simplex lattice $x_i = 0, 1/m, 2/m, \ldots, 1$ for $i = 1, \ldots, q$ and some positive integer
$m$, and the responses of all mixtures possible with these proportions are observed. The usual
resolution of the response into general mean, main effects, and interactions does not seem
possible, and so polynomial regression is employed. The problem of fitting an $n$th degree
polynomial in $x_1, \ldots, x_q$ to the response is complicated by the fact that different
polynomials give the same function on the simplex. Useful canonical forms are developed for
$n \le 3$. The coefficients in these forms are interpreted as various kinds of synergisms. The
analysis of experiments with these designs leads to classes of polynomials orthogonal on
the lattices. The paper will appear in J. Royal Stat. Soc., Series B. (Received October 25,
1957.)


4. Least-Squares Estimation when Residuals are Correlated. M. M. SIDDIQUI,
University of North Carolina.


Let $y_j$, $j = 1, \ldots, N$ be observations on a variate and let $y_j = \sum_i \beta_i x_{ij} + \Delta_j$,
$j = 1, 2, \ldots, N$, where $x_{ij}$ are non-stochastic, and $\Delta' = (\Delta_1, \ldots, \Delta_N)$ is a $N(0, \sigma^2 P)$ vector,
where 0 is a zero vector and $P$ is an $N \times N$ correlation matrix. Using the usual least-squares
estimates, $b_i$, of $\beta_i$ which are obtained by minimizing $\sum_j \Delta_j^2$, and $s^2$ of $\sigma^2$, the covariance
matrix of $b_i$ is obtained for general $P$ and bounds are set on these covariances by first
obtaining the maxima and minima of a quadratic and a bilinear form $u'Au$ and $u'Av$ where
$u$ and $v$ are $N \times 1$ vectors and $A$ is an $N \times N$ real symmetric matrix under the conditions
$u'u = v'v = 1$, $u'v = 0$. (Received October 31, 1957.)


5. A Property of Additively Closed Families of Distributions. EDWIN L. CROW,
Boulder Laboratories, National Bureau of Standards.


Consider a one-parameter additively closed family of univariate cumulative distribution
functions $F(x; \lambda)$ (H. Teicher, Ann. Math. Stat., Vol. 25 (1954), pp. 775-778). Let three
cumulants with orders in arithmetic progression exist and be non-zero. If all three orders
are even, or if the first order is odd, it is also required that $F(x; \lambda) = 0$ for $x < 0$ and
$F(x; \lambda) > 0$ for $x > 0$. Consider linear combinations, with real, non-zero coefficients, of
a finite number of independent variables with distributions in the family. It is proved that
the only such linear combinations whose distributions are also in the family are those with
coefficients unity. The additively closed families having this property may be called strictly
additively closed. It can be shown that (one-parameter) additively closed stable families of
distributions (normal and Cauchy in particular) with characteristic functions continuous
in $\lambda$ are not strictly additively closed, while Poisson, generalized Poisson, binomial, and
gamma families are strictly additively closed. (Received October 31, 1957.)


6. Determining Sample Size for a Specified Width Confidence Interval.
FRANKLIN A. GRAYBILL, Oklahoma State University.


If an experimenter decides to use a confidence interval to locate a parameter, he is
concerned with at least two things: (1) Does the interval contain the parameter? (2) How
wide is the interval? In general the answer to these questions cannot be given with absolute
certainty, but must be given with a probability statement. The problem the experimenter
then faces is: The determination of $n$, the sample size, such that (A) the probability will be
equal to $\alpha$ that the confidence interval contains the parameter, and (B) the probability will
be equal to $\beta^2$ that the width of the confidence interval will be less than $d$ units (where $\alpha$,
$\beta^2$, and $d$ are specified). To solve this problem will generally require two things: (1) The form
of the frequency function from which the sample of size $n$ is to be selected; (2) Some previous
information on the unknown parameters in the frequency function. This suggests that the
sample be taken in two steps; the first sample will be used to determine the number of
observations $n$ to be taken in the second sample so that (A) and (B) will be satisfied. For a
confidence interval on the mean of a normal population with unknown variance this problem
has been solved by Stein for $\beta^2 = 1$. In this paper a theorem is proved which gives a method
for determining $n$ so that (A) and (B) will be satisfied. The theorem holds for parameters
in the normal distribution and other distributions as well. (Received October 30, 1957.)


7. Nonparametric Estimation of Sample Percentage Point Standard Deviation.
JOHN E. WALSH, Lockheed Aircraft Corporation.


The available data consists of a random sample $x(1) < \cdots < x(n)$ from a reasonably
well-behaved continuous statistical population. The problem is to estimate the standard
deviation of a specified $x(r)$ that is not in the tails of the sample. The estimates examined
are of the form $a[x(r + i) - x(r - i)]$ and the explicit problem consists of determining
suitable values for $a$ and $i$. The solution $a = (1/2)(n + 1)^{-3/10}\{[r/(n + 1)][1 - r/(n + 1)]\}^{1/2}$
and $i = (n + 1)^{4/5}$ appears to be satisfactory. Then the expected value of the estimate
equals the standard deviation of $x(r)$ plus $O(n^{-9/10})$; also the standard deviation of this
estimate is $O(n^{-9/10})$. That is, the fixed and random errors for this point estimate are of the
same order of magnitude with respect to $n$. Solutions can be obtained which decrease the
order of one of these types of error. However, these solutions increase the order of the other
type of error, so that the over-all error magnitude exceeds $O(n^{-9/10})$. (Received November
7, 1957.)


8. On the Structure of Distribution-Free Statistics. C. B. BELL, Xavier University
of Louisiana and Stanford University.


Let $X_1, \ldots, X_n$ be a sample of a one-dimensional random variable $X$ which has
continuous cpf $F$. It has been observed that the distribution-free statistics commonly
appearing in the literature can be written in the form $\Phi[F(X_1), \ldots, F(X_n)]$, where $\Phi$ is a
measurable symmetric function defined on the unit cube. Such statistics are said to have
structure (d). In establishing that having structure (d) is equivalent to being symmetric and
strongly distribution-free for properly closed, symmetrically complete classes of cpf's,
this paper extends a result of Birnbaum and Rubin while employing different methodology.
These results interest a statistician because (1) they indicate that one should construct a
statistic of structure (d) whenever one wishes to design a distribution-free statistic; and
(2) they guarantee that each symmetric, strongly distribution-free statistic is of structure
(d), and, hence, that the value of its cpf at any point is the volume of a polyhedral region
in the unit cube. Under such circumstances the work of numerous statisticians indicates
that it should be possible to evaluate the cpf explicitly; reduce it to recursion formulae;
tabulate it with high-speed computers; or evaluate its limiting distribution. (Received
November 7, 1957.)


9. On the Supremum of the Poisson Process. RONALD PYKE, Stanford University.


Let $\{X(t); t \ge 0\}$ be a Poisson process (with shift) for which $\log E(e^{iuX(t)}) = -itua +
\lambda t(e^{iu} - 1)$, $u \in R_1$, $a, \lambda > 0$. Define $\varphi(x, T) = \Pr\{\sup_{0 \le t \le T} X(t) \le x\}$. Let $X_1, X_2, \ldots,
X_n$ be the ordered random variables of $n$ independent and uniform-(0, 1) random
variables. The distribution function, $\Pr\{\max_{1 \le i \le n} (ai - X_i) \le x\}$, is obtained for all
$a, x \in R_1$. For $a = 1/n$, this reduces to the distribution function of $D_n^+$ (cf. Birnbaum and
Tingey, A.M.S., Vol. 22). Utilizing this result, $\varphi(x, T)$ is obtained explicitly. Applications
of these expressions to queueing theory and distribution-free statistics are given.


10. On the Distributions of Various Sums of Squares in an Analysis of Variance
Table for Different Classifications With Correlated and Non-homogeneous
Errors. B. R. BHAT, Karnatak University. (Preliminary Report) (By
Title)


The distributions of various sums of squares in an analysis of variance table for two way
classification have been obtained by Box (Ann. Math. Stat., Vol. 25, pp. 484-498) under the
assumption that the vectors $X_j$ for $j = 1, 2, \ldots, q$ are independent vector observations
from a $p$-variate normal population with mean $\mu$ and covariance matrix $\Sigma$. The vector $X_j$
for each $j$th level of the factor $B$ denotes $p$ observations corresponding to the $p$ levels of
the other factor $A$. This paper gives the distributions of the various sums of squares for
any $n$-way classification under similar normality assumptions. It is noted that these
distributions, in general, follow a simple pattern and so does their mutual dependence. For $n =
3$, if we have a third factor $C$ at $r$ levels in addition to the above factors $A$ and $B$ and if we
assume that $Y_k$ for $k = 1, 2, \ldots, r$ are independent vector observations from a $pq$-variate
normal population, then, according to the general pattern the first set consists of
distributions of the sums of squares for the correction term, main effects $A$ and $B$ and their
interaction $AB$. The second set (the only remaining set) consists of the distributions of the
sums of squares for the remaining main factor $C$ and its interactions with the effects in the
first set. Any two distributions not belonging to the same set are independent, whereas
the distributions in the same set are mutually dependent. (Received May 10, 1957.)
NEWS AND NOTICES


Readers are invited to submit to the Secretary of The Institute news items of interest.


Personal Items 

During 1957-58 T. W. Anderson will be a Fellow at the Center for Advanced 
Study in the Behavioral Sciences in Stanford, California. 

Dr. Robert S. Aries is now Chairman of the Board of Aries Associates, whose
offices for general consultation were recently transferred to 77 South Street, 
Stamford, Connecticut. 

John Bailey has recently joined the staff of the Waltham Laboratories of 
Sylvania Electric Products, Inc., as an Engineer in their Applied Engineering 
Department. 

Colin R. Blyth, on leave from the University of Illinois, will be at Stanford
University for the academic year of 1957-58.

John V. Breakwell has taken a position as Staff Scientist with the Lockheed 
Missile Systems Division in Palo Alto. 

D. M. Brown is now studying for the degree of Ph.D. in statistics at Prince- 
ton University on a RAND Corporation Fellowship. 
Charles R. Carr has joined the research staff of The RAND Corporation at 
Santa Monica, California. 

Richard L. Carter has received his Ph.D. degree in statistics from the Univer- 
sity of North Carolina and has been appointed Associate Professor of Industrial 
Engineering at the Illinois Institute of Technology. 

Jonas M. Dalton completed his work for his Master’s degree in statistics at 
Virginia Polytechnic Institute in June, 1957. He is now employed at the Bell 
Telephone Laboratories, Murray Hill, New Jersey. 

Morris H. De Groot has been appointed Assistant Professor in the Department 
of Mathematics at Carnegie Institute of Technology. 

R. F. Drenick has joined the Bell Telephone Laboratories as a member of its 
technical staff. 

Joseph Dubay is now an instructor in the Mathematics Department of the 
University of Oregon. 

Professor Benjamin Epstein of Wayne State University is on leave at the 
Department of Statistics, Stanford University. 


L. A. Gardner, Jr., has resigned his position as research scientist at Columbia 
University’s Hudson Laboratories and is now employed as staff mathematician 
at M.I.T. Lincoln Laboratory. 

John J. Gart is now a graduate fellow at the Oak Ridge Institute of Nuclear
Studies, continuing work there toward a Ph.D. in statistics from V.P.I.

David W. Gaylor has resigned from the Nuclear Aircraft Research Facility,
Convair, Ft. Worth, to work toward a Ph.D. in experimental statistics at North
Carolina State College.

R. Gnanadesikan is now working with the Statistics Group at the Procter
and Gamble Company at Cincinnati, Ohio.

William A. Golomski, formerly Assistant Professor of Mathematics at Mar- 
quette University, is now in charge of operations research for Oscar Mayer and 
Company, Inc., Madison, Wisconsin. 

Roe Goodman has gone from Santiago, Chile, where he was F.A.O. agricul- 
tural statistician, to Karachi, Pakistan, where he is now sampling statistician 
under the I.C.A. program of the U. S. Government in that country. 

Ulf Grenander has accepted an appointment as Professor of Mathematical 
Statistics at Brown University. 

Irwin Guttman is on leave of absence from the University of Alberta and will 
spend the academic year of 1957-58 as a Research Associate in the Department 
of Mathematics, Statistical Section, of Princeton University. 

Following completion of assignment for Remington Rand International
(installation of UNIVAC I at European Computing Center, Frankfurt/Main,
Germany), Dr. Carl Hammer has accepted a similar position with Sylvania
Electric Products, Inc., at their Waltham Laboratories.

Gordon M. Harrington has left his position as consultant in research, Connect- 
icut State Department of Education, to become Associate Professor of Psy- 
chology and Department Chairman at Wilmington College, Wilmington, Ohio. 
Dr. Theodore W. Horner, formerly of the Statistical Laboratory, Iowa State
College, is now an Operations Research Analyst with General Mills, Inc.,
Minneapolis, Minnesota.

Patricia A. Inman received her M. A. in mathematics at U.C.L.A. in August, 
1957, and has accepted a position as Computing Engineer at Atomics Inter- 
national, Canoga Park, California. 

James Edward Jackson has taken a leave of absence from the Eastman Kodak 
Company to work on a doctorate at V.P.I. in Blacksburg, Virginia. 

Trinidad J. Jaramillo is now working with System Research, University of 
Chicago, as Senior Research Engineer. 

Peter W. M. John has accepted a position as research statistician with the 
California Research Corporation at Richmond, California. 

Andre G. Laurent, formerly with the Department of Statistics of Michigan 
State University, has accepted an appointment as Associate Professor in the 
Department of Mathematics of Wayne State University. 

J. Walter Lynch has moved from Huntsville, Alabama, to 101-L Rodman 
Road, Aberdeen, Maryland. 

Albert Madansky has been employed by the Mathematics Division, The 
RAND Corporation, Santa Monica, California. 

B. Mandelbrot, formerly of the University of Geneva, has accepted an ap- 
pointment in the University of Lille. 

Mr. C. L. Marcus has returned to the University of Illinois to complete his 
studies for a Ph.D. in statistics. He also holds an Assistantship in the Mathe- 
matics Department and a Fellowship from Armour Research. 


Dr. H. P. Mulholland has been appointed to a Senior Lectureship in Mathe- 
matics in the University of Exeter. 


Peter Newman has taken a post as Lecturer in Economics, University College
of the West Indies, Mona, St. Andrew, Jamaica, B.W.I.

Dr. Bernard Ostle, formerly Professor of Mathematics and Director of the 
Statistical Laboratory at Montana State College, is now with the Reliability 
Department of Sandia Corporation, Albuquerque, New Mexico. 

Dr. Raymond P. Peterson has accepted a position as Mathematical Statis- 
tician with the Research Department, Matson Navigation Company, San Fran- 
cisco, California. 

Paul H. Randolph has resigned his position as Assistant Professor of Industrial 
Engineering at Illinois Institute of Technology to accept a position as Associate 
Professor of Industrial Engineering at Purdue University. 

Dr. George J. Resnikoff, formerly Research Associate at the Applied Mathe- 
matics and Statistics Laboratory, Stanford University, has joined the Industrial 
Engineering Department of the Illinois Institute of Technology as Associate 
Professor. 

Robert H. Riffenburgh received his Ph.D. degree in statistics at the Virginia 
Polytechnic Institute and is now Assistant Professor of Mathematics at the 
University of Hawaii, Honolulu. 

Joseph Rosenbaum has resigned his position as Associate Mathematician, 
Systems Development Division, The RAND Corporation, and is presently em- 
ployed as statistician, Broadview Research Corporation, Burlingame, Cali- 
fornia. 

Dr. Jagdish S. Rustagi is an Assistant Professor in the Department of Statistics 
at Michigan State University, East Lansing, Michigan, for the current academic 
year. 

Melvin E. Salveson has formed the Council for Advanced Management to- 
gether with Herbert Holt, M.D., and is offering services for research in manage- 
ment and management education. 

Dr. M. M. Siddiqui, who obtained his doctorate degree from the University 
of North Carolina in June, 1957, is now working for a temporary period with 
the Boulder Laboratories of the National Bureau of Standards in the capacity 
of Mathematical Statistician. 

Professor Jack Wilber has returned to Roosevelt University after spending 
four months as Consultant to the Operations Analysis Office at the Air Force 
Missile Test Center. 

Morris Skibinsky has returned to the Statistical Laboratory at Purdue after 
a year’s leave of absence at Michigan State University. 

James H. Stapleton received his Ph.D. degree in mathematical statistics from 
Purdue University in June, 1957, and is now a statistical consultant in the 
statistical methods section of General Electric’s General Engineering Laboratory 
in Schenectady, New York. 

Daniel Teichroew has joined the Graduate School of Business at Stanford 
University as Associate Professor of Management. 

James E. Thompson has returned to his job as mathematician with the De- 
fense Department, having completed a year of graduate study with the Depart- 
ment of Statistics at Stanford University on a Defense Department fellowship. 

W. A. Thompson, Jr., has left the U.S. Air Defense Board and has accepted 
an academic position at the University of Delaware. 

Dr. Fred H. Tingey, mathematician, has been appointed by TECHNICAL 
OPERATIONS, INCORPORATED, as Assistant Chairman of Experiment Planning 
and Execution, Dr. Ian W. Tervet, Director of the research and development 
firm’s West Coast office, announced today. Dr. Tingey received his masters and 
doctoral degrees in Mathematics and Mathematical Statistics at the University 
of Washington. He graduated from Utah State College in 1947. In his new posi- 
tion, Dr. Tingey will plan and direct field experiments conducted by TECH- 
NICAL OPERATIONS in conjunction with the Combat Development Ex- 
perimentation Center (CDEC) at Fort Ord, California. 

John A. Tischendorf, having completed two years of active duty with the 
Commissioned Corps, U. S. Public Health Service, has joined the staff of the 
Allentown Laboratory of Bell Telephone Laboratories, Inc. 

Joseph B. Tysver, formerly an Associate Research Engineer at the University 
of Michigan, received his Ph. D. degree from that University in June, 1957, 
and has accepted a position as Research Specialist in the Pilotless Aircraft 
Division of Boeing Airplane Company, Seattle, Washington. 

Dr. John S. White has accepted a position as statistician with the Aero Divi- 
sion of Minneapolis-Honeywell Regulator Company. 

Robert A. Wijsman is now Assistant Professor in the statistics group of the 
Department of Mathematics at the University of Illinois. 




New Members 
The following persons have been elected to membership in The Institute 


August 7, 1957, to November 1, 1957 


Abraham, T. C., M. Sc. (Karnatak Univ., India), Teaching Fellow, Boston University 
Graduate School, Boston University, Boston 15, Massachusetts; 627 Commonwealth 
Ave., Boston 16, Mass. 

Albert, Arthur E., M. S. (Stanford Univ.), Student, Department of Statistics, Stanford 
University, Stanford, California; 20 Russell Ave., Portola Valley, Calif. 

Ali, Asghar, M. A. (Univ. of North Carolina), Lecturer in Statistics, Institute of Statistics, 
University of the Panjab, Lahore, Pakistan. 

Anglin, Ernie LaRue, B.S. (Univ. of Georgia), Student, Department of Mathematics, Uni- 
versity of Georgia, Athens, Georgia. 

Beatty, Richard L., M.S. (Univ. of Colorado), Instructor in Statistics, University of Wyo- 
ming, Laramie, Wyoming. 

Bland, Richard P., B. S. (Univ. of Oklahoma), Student, Department of Statistics, Uni- 
versity of North Carolina, Chapel Hill, North Carolina; 1 Audley Lane, Chapel Hill, 
N.C. 

Bobotek, Henry G., M. A. (Univ. of Illinois), Research Associate, Control Systems Labora- 
tory, University of Illinois, Urbana, Illinois. 

Bogdanoff, J. L., Ph. D. (Columbia Univ.), Professor of Engineering Sciences, Purdue Uni- 
versity, Lafayette, Indiana. 

Calhoun, David W., B. A. (Yale Univ.), Biometrician, G. D. Searle and Co., P. O. Box 5110, 
Chicago 80, Illinois; 820 Hamlin St., Evanston, Ill. 

Caspers, James W., M. S. in E. E. (Univ. of Washington), Head, Applied Theoretical 
Studies Group, U. S. Navy Electronics Lab., San Diego 52, California; 5014 August 
St., San Diego 10, Calif. 

Champernowne, D. G., M. A. (Cambridge Univ.), Professor in Statistics and Fellow of 
Nuffield College, Oxford University, Nuffield College, Oxford, England. 

Chapman, James W., M.S. (Univ. of Minnesota), Research Assistant, Department of Soils, 
Institute of Agriculture, University of Minnesota, St. Paul 1, Minnesota. 

Clarke, Geoffrey M., M. A. (Oxon), Statistician, Department of Agriculture and Horti- 
culture, National Fruit and Cider Institute, University of Bristol; University of Bristol, 
Research Station, Long Ashton, Bristol, England. 

Cotton, John W., Ph. D. (Indiana Univ.), Assistant Professor of Psychology, Northwestern 
University, Evanston, Illinois; Department of Statistics, Eckhart Hall, University of 
Chicago, Chicago 37, Illinois. 

Cox, Constance E., M. S. (Iowa State College), Head, Biometrics Section, Food and Drug 
Directorate, Department of National Health and Welfare, Tunney’s Pasture, Ottawa, 
Ontario, Canada. 

Dear, Robert E., Ph. D. (Univ. of Washington), Research Associate, Research Division, 
Educational Testing Service, 20 Nassau Street, Princeton, New Jersey. 

Engler, Jean, Ph. D. (Northwestern Univ.), National Science Foundation Postdoctoral 
Fellow, Department of Statistics, University of North Carolina, Chapel Hill, North Caro- 
lina. 

Evans, Richard V., A. B. (Princeton Univ.), Student, Department of Industrial Engineer- 
ing, Johns Hopkins University, Baltimore 18, Md.; Barroll Road, Baltimore 9, Md. 

Feldt, Leonard S., Ph. D. (State Univ. of Iowa), Assistant Professor, College of Education, 
State University of Iowa, Iowa City, Iowa. 

Fields, Raymond I., M. A. (Univ. of Arizona), Student, Virginia Polytechnic Institute, 
Blacksburg, Virginia; Speed Scientific School, University of Louisville, Louisville, Ken- 
tucky. 

Figge, Harry J., President, Harry J. Figge and Associates, 800 Liberty Building, Des Moines 
9, Iowa 

Flanagan, Richard M., B. A. (Univ. of Michigan), Senior Programmer, Argonaut Insurance 
Group, 250 Middlefield Road, Menlo Park, California. 

Gordon, M. H., Ph. D. (Univ. of Tennessee), Assistant Director, Central Neuropsychiatric 
Research Unit, Veterans Administration Hospital, Perry Point, Maryland; Box 546, 
Perry Point, Maryland. 

Govindarajulu, Z., M. A. (Univ. of Madras), Graduate Student, Statistics, and Teaching 
Assistant, Biostatistics Division, University of Minnesota, Minneapolis 14, Minnesota. 

Green, Lloyd G., B. A. (Washington Missionary College), Mathematician, Touche, Niven, 
Bailey and Smart, 1380 National Bank Building, Detroit 26, Michigan. 

Halton, John H., M. A. (Oxon), Research-Student in Faculty of Physical Sciences, Balliol 
College, Oxford University, Oxford, England. 

Hancock, John V., B.S. (Memphis State Univ.), Research Assistant, Department of Mathe- 
matics, University of Georgia, Athens, Georgia. 

Harrison, Gerald, Ph. D. (Calif. Institute of Technology), Mathematician, The Teleregister 
Corp., 445 Fairfield Avenue, Stamford, Connecticut. 

Heinhold, Josef, Dr. rer. nat. (Technische Hochschule München), Professor, Institut für 
Angewandte Mathematik, Technische Hochschule München, München 2, NW, Arcisstraße 
21, Germany. 

Hicks, Charles R., Ph. D. (Syracuse Univ.), Associate Professor of Mathematics and Re- 
search Associate in the Statistical Laboratory, Statistical Laboratory, Engineering 
Administration Building, Purdue University, Lafayette, Indiana. 

Hoyland, Arnljot, Cand. real. (Univ. of Oslo), Research Assistant, Forsikringsteknisk 
Seminar, University of Oslo, Blindern pr. Oslo, Norway; Krokvolden 20, Stabekk pr. 
Oslo, Norway 

Iversen, Iver Andrew, B. A. (Univ. of Minnesota), Teaching Assistant, University of Min- 
nesota, Minneapolis 14, Minnesota; 420 Fifth Street S.E., Minneapolis 14, Minnesota. 

Jacobsen, Fred M., Jr., Ph. D. (Iowa State College), Group Leader, Computer Program- 
ming and Mathematical Analysis, American Oil Co., Box 401, Texas City, Texas; 
Box 1537, Texas City, Texas. 

Jones, Alfred Welwood, Ph. D. (Columbia Univ.), Systems Engineer, Bell Telephone 
Laboratories, 463 West Street, New York 14, N.Y. 

Kakeshita, Shin’ichi, B. Sc. (Kyushu Univ.), Student, Math. Inst., Fac. Sci., Kyushu 
University, Fukuoka, Japan. 

Kim, Dong Sie, B. S. (Seoul National Univ.), Assistant, Dept. of Mathematics, Seoul Na- 
tional University, Seoul, Korea; 11-44 Ka-heo-Dong, Chong-no-Ku, Seoul, Korea. 

Knapp, Leslie E., B. S. and B. A. (Upper Iowa Univ.), Student, Stanford University, 
Stanford, California; 1255 Tucson Avenue, Sunnyvale, California. 

Lamm, Richard A., M. A. (Hofstra College), Analytical Statistician, Chemical Corps R 
and D Command, Biological Warfare Laboratories, Fort Detrick, Frederick, Maryland; 
75 Stewart Manor, Frederick, Maryland. 

Laubscher, N. F., M. Sc. (Potchefstroomse Universiteit vir C.H.O.), A. Research Officer, 
South African Council for Scientific and Industrial Research, National Physical Research 
Laboratory, P. O. Box 395, Pretoria, South Africa. 

Lindeman, Richard H., M. S. (Univ. of Wisconsin), Research Fellow, Bureau of Institu- 
tional Research, University of Minnesota, Minneapolis 14, Minnesota; 4936 Penn Ave. 
South, Minneapolis 9, Minn. 

Lokki, Olli Kristian, Dr. phil. (Univ. of Helsinki), Associated Professor, Institute of Tech- 
nology, Helsinki, Finland. 

Lunneborg, Clifford E., B. S. (Univ. of Washington), Research Assistant, Division of 
Counseling and Testing Services, University of Washington, Seattle, Washington; 
5218 16th Avenue N.E., Seattle 5, Washington. 

McGuire, Judson Ulery, Jr., Ph. D. (Iowa State College) Entomologist, Agricultural Re- 
search Service, U. S. Department of Agriculture, Washington, D. C.; Apartado 654, 
Camaguey, Cuba. 

Mikami, Misao, D. Sc. (Kyushu Univ.), Professor of Industrial Statistics, Seminar of In- 
dustrial Statistics, Faculty of Engineering, Kyushu University, Fukuoka, Japan. 

Mitra, S. S., M. S. (Univ. of Calcutta), Graduate Student and Teaching Assistant, De- 
partment of Mathematics, University of Washington, Seattle 5, Washington. 

Miyasawa, Koichi, D. Sc. (Kyushu Univ.), Assistant Professor of Mathematical Statistics 
and Econometrics, Faculty of Economics, Tokyo University, Tokyo, Japan. 

Pendergrass, R. N., M. A. (Univ. of Missouri), Professor of Mathematics, Radford College, 
Radford, Virginia. 

Pike, M. C., B. S. (Witwatersrand Univ., South Africa), Student, Statistical Laboratory, 
University of Cambridge, Trinity Hall, Cambridge, England. 

Redus, Faye, B. S. (Stephen F. Austin State College), Senior Analyst and Programmer, 
Sutherland Co., Suite 1112, First National Bank Building, Peoria, Illinois; 434 W. 21 
Street, San Bernardino, California. 

Reed, James C., Ph. D. (Univ. of Chicago), Director, Reading and Study Skills, Wayne 
State University, Detroit, Michigan. 

Rice, Philip L., B. S. (Principia College), Chief, Tropospheric Analysis Section, Radio 
Propagation Engineering Division, National Bureau of Standards, Boulder, Colorado; 
1103 Pine Street, Boulder, Colorado. 

Rogerson, G. W., B. S. (Melbourne), Student, Melbourne University, Carlton, Melbourne 
N3, Victoria, Australia; 48 Drummond St., Carlton, Melbourne N3, Victoria, Australia. 

Romano, Albert, M. A. (Washington Univ.), Student and Research Assistant, Dept. of 
Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia. 

Sagi, Philip C., Ph. D. (Univ. of Minnesota), Research Associate, Office of Population Re- 
search, 5 Ivy Lane, Princeton, New Jersey. 

Schoderbek, Joseph J., M.S. (Carnegie Inst. of Tech.), Research Engineer, Missile Systems 
Division, Lockheed Aircraft Corp., Palo Alto, California; 443 Carla Court, Mountain 
View, Calif. 

Schwartz, A. J., B.S. (Wayne Univ.), Student, Wayne State University, Detroit, Michigan; 
18478 Prest, Detroit, Michigan. 

Shaffer, Douglas H., Ph. D. (Carnegie Inst. of Technology), Mathematician, Westinghouse 
Research, Pittsburgh 35, Pa. 

Sherman, Seymour, Ph. D. (Cornell Univ.), Professor, Moore School of Electrical Engineer- 
ing, University of Pennsylvania, Philadelphia 4, Pennsylvania. 

Singh, Rajinder, M. A. (Panjab Univ., India), Graduate Assistant, University of Illinois, 
Department of Mathematics, Urbana, Illinois. 

Spicer, Ira G., B.S. (Univ. of Minnesota), Development Engineer, Project Leader of Tech- 
nical Analysis, Minneapolis-Honeywell, Ordnance Engineering, Hopkins, Minnesota; 
1928 Emerson Ave. So., Apt. 1-D, Minneapolis 5, Minnesota. 

Trammell, Carol D., B.S. (Carnegie Inst. of Tech.), Graduate Student and Teaching As- 
sistant, Department of Mathematics, Carnegie Institute of Technology, Pittsburgh 13, 
Pennsylvania. 

Weiler, H., M.S. (N.S.W. Univ. of Technology), Research Officer, CSIRO, Sheep Biology 
Laboratory, P.O. Box 144, Paramatta, N.S.W., Australia. 

Woods, W. Max, M.S. (Oregon Univ.), Student, Stanford University, Stanford, California; 
1919 Manhattan Ave., Apt. 4, East Palo Alto, California. 

Youtcheff, John S., A. B. (Columbia Univ.), Operations Analyst, General Electric Com- 
pany, Missile and Ordnance Systems Dept., Philadelphia, Pennsylvania; Post Office 
Box 155, Berwyn, Pennsylvania. 


Congratulations to the Office of Naval Research 


At the request of the Council, the President of the Institute of Mathematical 
Statistics has written a congratulatory letter to Admiral Bennett of the Office 
of Naval Research in connection with the tenth anniversary of the Office of 
Naval Research. The text of the letter follows. 

Dear Admiral Bennett: 

At the recent annual meeting of the Institute of Mathematical Statistics, 
the Council unanimously asked me to offer our congratulations and best 
wishes on the tenth anniversary of the establishment of the Office of Naval 
Research. 

Through its support of senior investigators and graduate students and the 
consequent publication of many important technical papers and books, the 
Office of Naval Research has been contributing greatly to the advancement 
of fundamental research in mathematical statistics and probability theory. 
This contribution is especially important because it is being made during a 
period when these fields are showing themselves capable of particularly rapid 
growth. 

The help you have given our profession is but one aspect of that program 
through which government and science work hand in hand to the benefit of 
each and to that of the nation as a whole, both in military and civilian pur- 
suits. The Navy Department is outstanding in this respect, and it must be 
pleased and honored by the record of your Office. 

Congratulations to you and your staff and best wishes for many more years 
of success in your efforts. 

Respectfully yours, 
Leonard J. Savage 
President 




Fifth Annual Southern Regional Graduate Summer Session in Statistics: 


The fifth Southern Regional Graduate Summer Session in Statistics will be 
held June 16 through July 26, 1958, at Oklahoma State University, Stillwater, 
Oklahoma. The summer sessions are rotated annually among the following 
institutions: Virginia Polytechnic Institute, Oklahoma State University, Uni- 
versity of Florida and North Carolina State College. 

The program may be entered at any session, and consecutive courses will be 
offered in successive summers. The summer work in statistics may be applied 
towards residence requirements at any one of the cooperating institutions, as 
well as certain other institutions, in partial fulfillment of residence requirements 
for graduate degrees. Each annual summer session lasts six weeks and the several 
courses offered carry three semester hours of graduate credit. 

The summer sessions are designed to carry out a recommendation of the 
Southern Regional Education Board’s Committee on Statistics, on which the 
four institutions initiating the program are represented. 

The sessions will be of particular interest to (1) research and professional 
workers who want intensive instruction in basic statistical concepts and who 
wish to learn modern statistical methodology, (2) teachers of elementary sta- 
tistics courses who want some formal training in modern statistics, (3) prospec- 
tive candidates for graduate degrees in statistics, (4) graduate students in other 
fields who desire supporting work in statistics, and (5) professional statisticians 
who wish to keep informed of advanced specialized theory and methods. 

The faculty for the 1958 Summer Session at Oklahoma State University will 
include the following visiting professors: H. O. Hartley, Statistical Laboratory, 
Iowa State College; Walter T. Federer, Biometrics Unit, Cornell University; 
John E. Freund, Department of Mathematics, Arizona State College; A. W. 
Wortham, Operations Research Department, Texas Instruments, Dallas, Texas. 

The local staff includes: Carl E. Marshall (Ph.D., Iowa State), Franklin 
Graybill (Ph.D., Iowa State), Robert D. Morrison (Ph.D., North Carolina 
State), John Hamblen (Ph.D., Purdue), Roy Deal (Ph.D., University of Okla- 
homa), and John Hoffman (Ph.D., University of Oklahoma). 

Of particular interest at this summer session will be the six weekly symposia 
covering six important areas in statistics. They are: Sampling Survey Designs, 
Experimental Designs, Non-parametric Statistics, Response Curves and Sur- 
faces, Multiple Comparisons, and High Speed Computing. Discussants will be 
selected from major contributors to these areas. These invited speakers together 
with the outstanding summer school staff will cover the respective subjects from 
three points of view: applications, their mathematical bases, and the problems 
that lie on the frontier. 

Inquiries should be addressed to Carl E. Marshall, Director, Statistical Labora- 
tory, Oklahoma State University, Stillwater, Oklahoma. 

List of International and Foreign Scientific and Technical Meetings 
October 1, 1957 through 1960 


(The following information was extracted from a list compiled by the Office of Scientific 
Information of the National Science Foundation.) 


Date and Place: Oct. 21, 1957; Lourenco Marques, Mozambique 
Meeting, Sponsor and Subject: 2nd Inter-African Conference on Statistics, Inter-African Committee of Statistics 
Address Queries to: Secretariat, Commission for Technical Cooperation in Africa South of the Sahara, 43 Parliament Street, London, S. W. 1, England 

Date and Place: Dec. 26, 1957 to Jan. 4, 1958; Berkeley, California 
Meeting, Sponsor and Subject: Symposium on Axiomatic Method, with Special Reference to Geometry and Physics 
Address Queries to: Professor Alfred Tarski, Department of Mathematics, University of California, Berkeley 4, California 

Date and Place: Apr. 9-13, 1958; Giessen, Germany 
Meeting, Sponsor and Subject: Society for Applied Mathematics and Mechanics (GAMM), Annual Meeting 
Address Queries to: Professor Dr. Egon Ullrich, Mathematisches Institut der Justus Liebig-Hochschule, Johannesstrasse 1, (16) Giessen, Germany 

Date and Place: Apr. 13-16, 1958; Giessen, Germany 
Meeting, Sponsor and Subject: German Mathematics Society (DMV), Annual Meeting 
Address Queries to: Professor Dr. Egon Ullrich, Mathematisches Institut der Justus Liebig-Hochschule, Johannesstrasse 1, (16) Giessen, Germany 

Date and Place: June 1958; Strasbourg, France 
Meeting, Sponsor and Subject: International Association for Analogy Computation, 2nd International Meeting and 1st General Assembly 
Address Queries to: Professor J. Hoffmann, Universite Libre, 50 Avenue Franklin D. Roosevelt, Brussels, Belgium 

Date and Place: Aug. 11-13, 1958; St. Andrews, Scotland 
Meeting, Sponsor and Subject: International Mathematical Union, 3rd General Assembly 
Address Queries to: Mr. F. Smithies, Mathematical Institute, 16 Chambers Street, Edinburgh 1, Scotland 

Date and Place: Aug. 14-21, 1958; Edinburgh, Scotland 
Meeting, Sponsor and Subject: 11th International Congress of Mathematicians. Subjects: logic and foundations, algebra and theory of numbers; analysis; topology; geometry; probability and statistics; applied mathematics, mathematical physics and numerical analysis; and history and education 
Address Queries to: Mr. F. Smithies, Mathematical Institute, 16 Chambers Street, Edinburgh 1, Scotland 

Date and Place: Sept. 3-10, 1958; Namur, Belgium 
Meeting, Sponsor and Subject: 2nd International Congress for Cybernetics, Association Internationale de Cybernetique (ASBL). Subjects: Information; Automatism (Cybernetics applied to machinery); Automation (Cybernetics used in organizing labor); The economical and social consequences of Automation; Cybernetics and social sciences; Cybernetics and biology 
Address Queries to: International Association for Cybernetics, 13, rue Basse Marcelle, Namur, Belgium 

Date and Place: Sept. 1958; Brussels, Belgium 
Meeting, Sponsor and Subject: International Statistical Institute, Special Session 
Address Queries to: Institut National de Statistique, 44, rue de Louvain, Brussels, Belgium 

Date and Place: 1958, date undecided; Warsaw, Poland 
Meeting, Sponsor and Subject: International Symposium on Nonhomogeneity in Elasticity and Plasticity, International Union for Theoretical and Applied Mechanics (IUTAM) 
Address Queries to: Dr. Hugh L. Dryden, President of Union, NACA, 1512 H Street, N. W., Washington, D. C.; or Professor F. N. van den Dungen, Secretary of Union, 41 avenue de l'Arbalete, Boitsfort, Brussels, Belgium 

Date and Place: 1960, date undecided; Stresa, Italy 
Meeting, Sponsor and Subject: 10th International Congress of Applied Mechanics, International Union of Theoretical and Applied Mechanics (IUTAM) 
Address Queries to: Dr. Hugh L. Dryden, President of Union, NACA, 1512 H Street, N. W., Washington, D. C.; or Professor F. N. van den Dungen, Secretary of Union, 41 avenue de l'Arbalete, Boitsfort, Brussels, Belgium 




Royal Statistical Society Research and Industrial Applications Sections 


The Research Section and the Industrial Applications Section of the Royal 
Statistical Society intend to hold a Conference at the University of St. Andrews, 
near Edinburgh, Scotland, from 22 August to 1 September inclusive. It will 
be devoted to Mathematical Statistics, with special reference to those branches 
of the subject which have application in industry. 

It is proposed that there should be three morning sessions, consisting each of 
two or three pre-arranged lectures, and three early evening sessions (5.30 p.m. 
to 6.30 p.m.) each with one pre-arranged lecture. 

The afternoons will be devoted to ‘Splinter Groups’ which will devote them- 
selves to special aspects, and at which informal talks of some ten or fifteen 
minutes each can be given without prior arrangement. 

Topics to be covered in the morning and evening sessions include aspects of 
the analysis of variance, non-parametric inference, stochastic aspects of linear 
and dynamic programming, and foundations of probability in statistics. 

It is hoped that many of the mathematical statisticians who will be coming 
from abroad to attend the Edinburgh International Congress of Mathematicians, 
will choose to remain in Scotland for a further few days, and take the opportunity 
of meeting colleagues specially interested in their field. St. Andrews, besides 
having the famous Golf Course, is a small Scottish town of considerable char- 
acter, and a very good centre for the exploration of the Eastern Highlands. 

Accommodation (from 21 August to 2 September) will be provided within the 
hostels of the University of St. Andrews at a reasonably low cost, details to follow 
later. Anyone interested should write, marking the envelope ‘ST. ANDREWS 
CONFERENCE’ to Miss U. Croker, Royal Statistical Society, address as 
above. 




University of Michigan Graduate School of Public Health Summer Program 


The University of Michigan is the host institution for a cooperative program 
by the accredited Schools of Public Health of the United States during the 
summer of 1958. 

The summer program for 1958 is designed to meet some of the educational 
and training needs of men and women engaged in work in health and health 
related agencies or those preparing themselves for such work. Courses are offered 
at three levels. The elementary level courses are intended for those who have 
acquired little or no background in statistical methodology. Intermediate 
courses present subject matter to extend and improve knowledge and skills of 
those persons who have acquired the elementary concepts and skills of statistical 
methodology. The advanced level courses are for those who have acquired 
considerable background in the theory and application of statistical concepts 
and procedures. A seminar open to all students includes the presentation of 
topics of current national interest related to health sciences and statistical 
methodology. 

The faculty will consist of Helen Abbey, The Johns Hopkins University; William 
G. Cochran, Harvard University; Jerome Cornfield, The National Institutes of 
Health; Bernard Greenberg, University of North Carolina; F. M. Hemphill, 
University of Michigan; Leslie Kish, University of Michigan; Donovan Thomp- 
son, University of Pittsburgh; Colin White, Yale University. 

If possible, completed applications and transcripts should reach Ann Arbor 
by June 1, 1958, for Michigan residents and May 1, for nonresidents. Requests 
for application forms should be addressed to the Director of the Summer Program 
in Public Health Statistics, School of Public Health, University of Michigan, 
Ann Arbor, Michigan. 

A limited number of scholarships will be available to qualified students taking 
courses for credit. Inquiries concerning scholarships should be addressed to Dr. 
F. M. Hemphill, Director of the Summer Program in Public Health Statistics, 
School of Public Health, University of Michigan, Ann Arbor, Michigan. 




IMS MEMBERS ATTENDING THE 1957 ANNUAL MEETING OF THE IMS 


(This list was not received in time to be included with the report in the De- 
cember, 1957 issue.) 


Forman S. Acton, Frank B. Akutowicz, William R. Allen, Allan G. Anderson, R. L. 
Anderson, Virgil L. Anderson, William B. Anderson, Abdur Rahman Ansari, F. J. Anscombe, 
Kenneth J. Arnold, Samuel I. Askovitz, Ralph Hoyt Bacon, J. C. Bain, Theodore A. Ban- 
croft, Rolf E. Bargmann, Glenn E. Bartsch, W. D. Baten, Grace E. Bates, Geoffrey Beall, 
Helen P. Beard, Robert Eric Bechhofer, Charles Bernard Bell, Jr., Irving Belson, Andrew 
Angelo Benvenuto, Agnes P. Berger, Joseph Berkson, Gerald D. Berndt, Max A. Bershad, 
Reid A. T. Bhaucha, Charles A. Bicking, Patrick Paul Billingsley, Richard S. Bingham, Jr., 
Allan Birnbaum, David Blackwell, Herman Blasbalg, Chester I. Bliss, Julius R. Blum, 
Isadore Blumen, John B. Boddie, Derrill Joseph Bordelon, Raj C. Bose, Helen Bozivich, 
Ralph Allan Bradley, A. E. Brandt, Leroy S. Brenna, Glenn W. Brier, Harold F. Bright, 
Samuel H. Brooks, Bernice Brown, Irwin D. J. Bross, Benjamin Buchbinder, Robert W. 
Burgess, Paul J. Burke, Irving W. Burr, Glenn L. Burrows, Lyle D. Calvin, Burton H. 
Camp, C. S. Callard, Mavis B. Carroll, Marvin F. Carter, Jack Chassan, Herman Chernoff, 
Victor Chew, Chin Long Chiang, John T. Chu, Joseph Louis Ciminera, Ira H. Cisin, Willard 
H. Clatworthy, Andrew G. Clark, Charles William Clunies-Ross, William G. Cochran, 
Paul M. Cohen, William S. Connor, Clyde H. Coombs, Lewis C. Copeland, Richard G. 
Cornell, Jerome Cornfield, L. M. Court, Edwin L. Cox, Paul Charles Cox, Allen T. Craig, 
Elliot M. Cramer, Jean A. Crockett, Lee S. Crump, Edward Eugene Cureton, Joseph F. 
Daly, Cuthbert Daniel, Herbert Theodore David, Willis L. Davis, Read B. Dawson, Jr., 
W. Edwards Deming, Arthur P. Dempster, Cyrus Derman, Lucile Derrick, Earl Louis 
Diamond, John K. Diederichs, James L. Dolby, Tom G. Donnelly, Acheson J. Duncan, 
David B. Duncan, Paul R. Dunlap, Charles W. Dunnett, David Durand, Arthur Morlan 
Dutton, Meyer Dwass, Albert Ross Eckler, Jr., Sylvain Ehrenfeld, Churchill Eisenhart, 
Harry Eisenpress, Salah A. Elmaghraby, Lila R. Elveback, Daniel R. Embody, Walter T. 
Federer, A. V. Fend, Robert Ferber, George Emery Ferris, William B. Fetters, Donald 
Fraser, David Frazier, Spencer M. Free, Spencer Michael Free, Jr., Agnes M. Galligan, 
Donald A. Gardiner, Werner Gautschi, Charles E. Gates, Donald Paul Gaver, Seymour 
Geisser, Lincoln J. Gerende, George William Gershefski, B. C. Getchell, Walter M. Gilbert, 
Dorothy Morrow Gilford, Leon Gilford, Harold Glazer, William A. Glenn, Ramanathan 
Gnanadesikan, Leo A. Goodman, Mina H. Gourary, Bernard G. Greenberg, Samuel W. 
Greenhouse, Joseph Arthur Greenwood, Frank E. Grubbs, Lee Gunlogson, John Gurland, 
Donald Guthrie, Irwin Guttman, Robert John Hader, Max Halperin, James F. Hannan, 
Morris Howard Hansen, Robert H. Hanson, Bernard Harris, T. E. Harris, Boyd Harsh- 
barger, H. Leon Harter, Herman O. Hartley, William C. Healy, Jr., Paul Heit, F. M. Hemp- 
hill, G. Ronald Herd, Irene Hess, Clifford Hildreth, Wassily Hoeffding, Robert G. Hoffmann, 
John F. Hofmann, David Hogben, Paul G. Homeyer, Robert Hooke, Theodore Wright 
Horner, William H. Horton, Daniel G. Horvitz, Professor Harold Hotelling, Earl E. House- 
man, David R. Howes, Walter W. Hoy, Cyril J. Hoyt, John David Hromi, Harry M. Hughes, 
J. Stuart Hunter, David V. Huntsberger, Benjamin Jackson, John L. Jaech, T. A. Jeeves, 
Milton Vernon Johns, Jr., Howard L. Jones, Hyman B. Kaitz, Samuel Karlin, Abraham E. 
Karp, Marvin A. Kastenbaum, Leo Katz, Mort Keats, Oscar Kempthorne, Robert W. 
Kennard, George H. Kennedy, Bradford F. Kimball, Edgar P. King, Cal J. Kirchen, Leslie 
Kish, Truman L. Koehler, Martin Krakowski, Clyde Y. Kramer, William C. Krumbein, 
William Kruskal, Morton Kupperman, George M. Kuznets, Mrs. Helen Humes Lamale, 
Donald E. Lamphiear, Fred C. Leone, Howard Levene, Alfred Lieberman, Gerald J. Lieber- 
man, Gilbert Lieberman, Jacob E. Lieberman, Julius Lieblein, Benjamin Lipstein, Stuart 
P. Lloyd, Frederic M. Lord, Eugene Lukacs, Bob Lundegard, George F. Lunger, John 
Hans MacKay, William G. Madow, Ralph A. Maggio, Clifford Joseph Maloney, Joseph 
Mandelson, Henry Berthold Mann, Eli S. Marks, Robert H. Matthias, Philip John Mc-
Carthy, Duncan C. McCune, Harlley Ellsworth McKean, Paul M. Meier, Margaret Merrell, 
W. Jay Merrill, Herbert A. Meyer, Paul D. Minton, Robert Mirsky, Sutton Monro, Alex 
M. Mood, Roger H. Moore, Donald Frank Morrison, Milton NMI Morrison, Norman Morse, 
Jack Moshman, Frederick Mosteller, Mervin E. Muller, R. B. Murphy, Jack Nadler, L. F. 
Nanni, Raymond Nassimbene, Joseph Anthony Navarro, August A. Carl Nelson, Jr., Peter 
E. Ney, S. I. Neuwirth, George E. Nicholson, Monroe L. Norden, Jack I. Northam, Horace 
W. Norton, Aloysius Joachim O’Connor, Junjiro Ogawa, Ingram Olkin, Paul S. Olmstead, 
Bernard Ostle, Donald B. Owen, William R. Pabst, Jr., Nancy S. Parker, Dr. Ellis F. Par- 
menter, Emanuel Parzen, John F. Pauls, Robert Nixon Pendergrass, B. E. Phillips, Eugene 
W. Pike, Edwin James George Pitman, Aloysius J. Polaneezky, Bruce P. Price, Ronald 
Pyke, Dana Edward Anthony Quade, Lila Knudsen Randolph, Herman Ravitch, Stanley 
Reiter, Elmer Edwin Remmenga, G. J. Resnikoff, William L. Roach, Jr., Spencer W. Roberts, 
Herbert Robbins, Douglas S. Robson, Robert Roeloffs, Harry M. Rosenblatt, Joan R. 
Rosenblatt, Murray Rosenblatt, Irving Roshwalb, S. N. Roy, Herman Rubin, David 
Rubinstein, Phillip Justin Rulon, Marion M. Sandomire, F. E. Satterthwaite, Sam Cundiff 
Saunders, Leonard J. Savage, Marvin A. Schneiderman, Seymour Max Selig, Richard H. 
Shaw, Sidney Shtulman, Walt R. Simmons, Rosedith Sitgreaves, John H. Smith, Thaddeus 
L. Smith, Jean E. Smolak, Milton Sobel, Herbert Solomon, Paul N. Somerville, Frederick 
A. Sorensen, Melvin Dale Springer, John J. Stansbrey, James Hall Stapleton, Robert G. D. 
Steel, Arthur Stein, Frederick F. Stephan, Theodor D. Sterling, John N. Stewart, Ray B. 
Stiver, Jr., David S. Stoller, Samuel A. Stouffer, Jacques St. Pierre, Hale C. Sweeny, Zen 
Szatrowski, Robert J. Taylor, James G. C. Templeton, Benjamin J. Tepping, Milton E. 
Terry, Earl A. Thomas, William Alfred Thompson, Jr., George W. Thomson, Leo J. Tick, 
John W. Tukey, Malcolm E. Turner, Hubertus Robert Van der Vaart, Herman W. Von- 
Guerard, Helen M. Walker, David L. Wallace, W. Allen Wallis, John E. Walsh, Louis 
Weiner, Harry Weingarten, Irving Weiss, Phillips Whidden, Alfred G. Whitney, D. Ransom 
Whitney, John M. Wiesen, Frank Wilcoxon, Martin B. Wilk, Samuel S. Wilks, John W. 
Wilkinson, Evan James Williams, Gregory Williams, Myron J. Willis, Russell Lowell Wine, 
Gerald Winston, Max A. Woodbury, G. Stanley Woodson, Charles Ashley Wright, Charles 
W. Wright, William J. Youden, Marvin Zelen, John Arthur Zoellner. 




Visiting Foreign Mathematicians 


The following list of visiting foreign mathematicians has been received from 
the Division of Mathematics, National Academy of Sciences—National Re- 
search Council. The information given is, in order, the name, home country, 
host institution, and period of visit; AY stands for academic year 1957-1958. 

—— 


Adams, John F.—U. K.—Institute for Advanced Study—AY; Adem, Jose—Mexico— 
Princeton University—Feb. 1958-June 1958; Akizuki, Yasuo—Japan—University of Chi- 
cago—Oct. 1, 1957-June 30, 1958; Albertoni, Sergio—Italy—New York University—Sept. 
1957-Feb. 1958; Andreotti, Aldo—Italy—Institute for Advanced Study (Sept. 30, 1957-Dec. 
20, 1957), Princeton University (Feb. 1958-June 1958)—Sept. 30, 1957-June 1958; Azumaya, 
Goro—Japan—Yale University—Sept. 1956-Sept. 1958; Beale, E. M. L.—U. K.—Princeton 
University—Jan. 1958-Dec. 1958; Birch, Bryan J.—U. K.—Princeton University—Sept. 
1957-June 1958; Björck, Göran—Sweden—Institute for Advanced Study—AY; Bofinger, 
Victor J.—Australia—North Carolina State College—June 1957-April 1958; Burgers, 
Johannes M.—Netherlands—American University, National Bureau of Standards—Oct. 
11, 1956-Oct. 1957; Carleson, Lennart—Sweden—Massachusetts Institute of Technology— 
Sept. 1957-Jan. 31, 1958; Cartier, Pierre—France—Institute for Advanced Study—AY; 
Chakravorti, J. G.—India—Brown University—AY; Chand, Uttam—India—Boston Uni- 
versity—Jan. 1958-May 1958; Cohen, Daniel E.—U. K.—Princeton University—Sept. 
1957-June 1958; Copson, E. T.—Scotland—Stanford University—Week—Feb. 1958; Corsten, 
L. C. A.—Netherlands—University of North Carolina—Sept. 1957-June 1958; Dedecker, 
Paul—Belgium—Institute for Advanced Study—Sept. 30, 1957-Dec. 20, 1957; Delsarte, 
Jean—France—University of Maryland—Apr. 1957-July 1957; Deny, Jacques—France— 
Institute for Advanced Study—Sept. 30, 1957-Dec. 20, 1957; de Rham, Georges—Switzer- 
land—Institute for Advanced Study—AY; Dold, Albrecht—Germany—Institute for 
Advanced Study—Sept. 1956-Aug. 1958; Dvoretzky, Aryeh—Israel—Institute for Ad- 
vanced Study—AY; Edwards, David A.—U. K.—Yale University—Sept. 1956-Sept. 1958; 
Ewald, Guenther—Germany—Michigan State University—Sept. 1957-June 1958; Festa, 
Rudolf—Austria—University of Alabama, 1956-57 (State College of Washington 1957-58) 
—Sept. 1956-Sept. 1958; Festa, Erika—Austria—State College of Washington (Sept.—Dec. 
1957)—Sept. 1956-Sept. 1958; Foguel, Shaul—Israel—New York University—AY; Frohlich, 
A.—U. K.—University of Virginia—Feb. 1958-June 1958; Gamblen, Frank—Australia— 
University of Kansas (Sept. 4, 1957-Feb. 1, 1958), Educational Testing Service, Princeton, 
N. J. (Feb. 1, 1958-June 1958)—Sept. 4, 1957-June 1958; Gautschi, Walter—Switzerland 
—American University, National Bureau of Standards—Oct. 1955-Sept. 1958; Ghaffari, A. 
—Iran—American University, National Bureau of Standards—Sept. 1956-Sept. 1958; 
Goldner, Siegfried—Union of South Africa—New York University—AY; Grauert, Hans— 
Germany—Institute for Advanced Study—AY; Grenander, U.—Sweden—Brown Univer- 
sity—AY; Griffiths, Hubert B.—U. K.—Institute for Advanced Study—Sept. 1956-July 
1958; Guttman, Irwin—Canada—Princeton University—Sept. 1957-June 1958; Hano, 
Jun-ichi—Japan—University of Washington—Sept. 15, 1957-June 15, 1958; Harrop, Ronald 
—U. K.—Pennsylvania State University—Aug. 1957-Aug. 1958; Hellman, Olavi B.—Fin- 
land—University of California, Los Angeles—July 1956-July 1958; Helmberg, Gilbert— 
Austria—University of Washington—October 1, 1957-June 15, 1958; Hervé, Michel—France 
—Institute for Advanced Study—Sept. 30, 1957-Dec. 20, 1957; Hirsch, Guy—Belgium— 
Massachusetts Institute of Technology—Feb. 1, 1958-June 15, 1958; Hitotumatu, Sin—
Japan—Stanford University—Sept. 1, 1957-June 30, 1958; Hüsser, Rudolph—Switzerland 
—University of California, Los Angeles, AY; Izumi, Shin-ichi—Japan—University of 
Chicago and Northwestern University (Aug. 1, 1957-May 31, 1958), Princeton University 
(Oct. 1957-Dec. 1957) —Aug. 1957-May 1958; Kato, Tosio—Japan—New York University 
—Sept.—Oct. 1957; Kawata, T.—Japan—Princeton University—Sept. 1957-March 1958; 
Kitagawa, T.—Japan—Princeton University—Sept. 1957-March 1958; Klingenberg, Wil-
helm—Germany—Institute for Advanced Study—AY; Laasonen, V. Pentti J.—Finland—
University of California, Los Angeles—July 1956-Aug. 1958; Lacombe, Daniel L. M.—
France—Institute for Advanced Study—Oct. 1957-Aug. 1958; Lehto, Olli E.—Finland—In- 
stitute for Advanced Study—AY; Leopoldt, Heinrich W.—Germany—Institute for Ad-
vanced Study—Sept. 1956-Aug. 1958; Leray, Jean—France—Institute for Advanced 
Study—Sept. 30, 1957-Dec. 20, 1957; Lions, Jacques—France—University of Kansas—Feb. 
1957-Aug. 1958; Longuet-Higgins, Michael S.—U. K.—Massachusetts Institute of Tech- 
nology—Feb. 1, 1958-June 15, 1958; Lorenzen, Paul P. W.—Germany—Institute for Ad- 
vanced Study—Sept. 1957-June 1958; Lucas, John R.—U. K.—Princeton University—
Sept. 1957-June 1958; Lumer, Günter—Uruguay—University of Chicago—Oct. 1, 1957- 
Sept. 30, 1958; Mallows, Colin L.—U. K.—Princeton University—Sept. 1957-Sept. 1958; 
Mardešić, Sibe—Jugoslavia—Institute for Advanced Study—AY; Martin, Alfred I.—U. K. 
—Institute for Advanced Study—AY; Masani, Pesi—India—Massachusetts Institute of 
Technology—Harvard University—Sept. 16, 1957-June 15, 1958; Message, Philip J.—U. K. 
—Yale University—Sept. 1957-Sept. 1958; Milne-Thomson, L. M.—U. K.—Brown Uni- 
versity—Sept. 1956-June 1958; Mixner, Joseph—Germany—New York University—Aug- 
Oct. 1957; Moller, Christian—Denmark—Carnegie Institute of Technology—Sept. 1957- 
Feb. 1958; Nachbin, Leopoldo—Brazil—University of Chicago—Oct. 1, 1956-July 31, 1958; 
Nagata, Masayoshi—Japan—Harvard University—Sept. 1957-Sept. 1958; Nieminen, 
Toivo E.—Finland—New York University—Aug. 1957-June 1958; Ogawa, Junjiro—Japan 
—Institute of Statistics, University of North Carolina—Sept. 1956-Aug. 31, 1958; Olver, 
F. W. J.—U. K.—National Bureau of Standards—Sept. 30, 1957-Sept. 1958; O'Meara, 
Onorato T.—South Africa—Institute for Advanced Study—AY; Ono, Katsuzi—Japan— 
Massachusetts Institute of Technology—Sept. 1957-June 1958; Ostrowski, Alexander M. 
—Switzerland—American University, National Bureau of Standards—Oct. 10-31, 1957; 
Papakyriakopoulos, C. D.—Greece—Institute for Advanced Study—June 3, 1955-June 
1958; Peixoto, Mauricio M.—Brazil—Princeton University—Sept. 1957-June 1958; Pfluger, 
Albert—Switzerland—Stanford University—Oct. 1, 1957-Mar. 30, 1958; Pucci, Carlo—Italy 
—University of Maryland—Sept. 1, 1956-July 1958; Puppe, Dieter—Germany—Institute 
for Advanced Study—Sept. 1957-June 1958; Rieger, Georg J.—Germany—University of 
Maryland—Sept. 1956-Aug. 1957; Riesz, Marcel—Sweden—University of Maryland—Oct. 
1, 1957-Dec. 31, 1957; Robinson, Leslie R. B.—U. K.—Harvard University—Sept. 1957- 
Sept. 1958; Rogosinski, Werner W.—U. K.—University of Colorado—Sept. 1957-Sept. 
1958; Rohrbach, Hans—Germany—University of North Carolina—Sept. 1957-June 1958; 
Room, T. G.—Australia—Institute for Advanced Study (AY); Princeton University (Feb. 
1958-June 1958)—Sept. 1957-June 1958; Roseau, Maurice—France—New York University 
—Sept. 1957-Sept. 1958; Sawyer, W. W.—U. K.—University of Illinois—Feb. 1957—Indefi- 
nite; Schopf, Andreas—Switzerland—American University-National Bureau of Standards 
—Sept. 30, 1957-Oct. 1958; Scriba, J. Cristoph—Germany—University of Kentucky— 
Sept. 1957-June 1958; Selberg, Sigmund—Norway—University of Colorado—Sept. 1957- 
May 1958; Serre, Jean-Pierre—France—Institute for Advanced Study—Sept. 30, 1957- 
Dec. 20, 1957; Skolem, Thoralf A.—Norway—University of Notre Dame—Sept. 1957-June 
1958; Stoll, Wilhelm—Germany—Institute for Advanced Study—AY; Tamagawa, Tsuneo—
Japan—Institute for Advanced Study (Sept. 1955-Jan. 1957 and Jan. 13, 1958-Apr. 11, 
1958), Johns Hopkins University (Jan. 1957-Jan. 1958)—Sept. 1955-Apr. 1958; Tomonaga, 
Yasuro—Japan—University of Washington—Oct. 15, 1957-June 15, 1958; Valpola, Veli 
Kustaa—Finland—University of California (2 months), Princeton University (4 months)—
Nov. 1957—April 1958; van der Vaart, H. R.—Netherlands—North Carolina State College 
—Jan. 1957-Jan. 1958; Villamayor, Orlando—Argentina—Institute for Advanced Study— 
Jan. 1, 1957—Dec. 31, 1957; Waelbroeck, L.—Belgium—Institute for Advanced Study—AY; 
Watson, Geoffrey S.—Australia—North Carolina State College (4 months), Princeton 
University (5 months)—April 1958-Dec. 1958; Williams, Robert M.—U. K.—Princeton 
University—Sept. 1957-June 1958; Wolff, Emil—U. K.—New York University—June-Nov. 
1957; Yamamuro, Sadayuki—Japan—Institute for Advanced Study—Sept. 1956-Sept. 
1958; Yevdjevich, V. M.—Yugoslavia—American University, National Bureau of Stand- 
ards—AY; Yüksel, H.—Turkey—Brown University—AY; Zadunaisky, Pedro—Argentina—
Watson Scientific Computing Laboratory—Feb. 1, 1957-Jan. 31, 1958; Agudo, F. R. D.— 
Portugal—University of California, Berkeley—Fall 1957; Baayen, Pieter C.—Netherlands— 
University of California, Berkeley—AY; Fary, Istvan—Canada (Hungary)—University 
of California, Berkeley—Jan.-June 1958; Festa, Erika—Austria—State College of Washington—
Sept.-Dec. 1957; Lightstone, A. H.—Canada—University of California, Berkeley—AY; 
Littlewood, J. E.—England—University of California, Berkeley—Sept. 27-Dec. 18, 1957; 
Poulsen, Ebbe T.—Denmark—University of California, Berkeley—July 1957-June 1958; 
Specker, Ernst Paul—Switzerland—Cornell University—Feb. 10-Sept. 1958; Szmielew, 
Wanda—Poland—University of California, Berkeley—AY. 




Committee on Mathematical Tables 


Since its organization early in 1956, the Institute of Mathematical Statistics’ 
Committee on Mathematical Tables has been concerned with the problems 
associated, either directly or indirectly, with the computation of mathematical 
tables of interest to statisticians. The committee’s function is threefold: 

(i) To gather information relating to the tabulation of functions of interest 
to statisticians. 

(ii) To advise on the need for, and preparation of, statistical tables. 

(iii) To determine the availability of and coordinate the distribution of free 
time on high speed digital computers for the computation of statistical 
tables. 

In order to fulfill its function, the committee investigated the interests and 
needs of the Institute membership concerning statistical tables and as a result 
set up nine subcommittees covering the areas of greatest interest. To date the 
activities of these subcommittees have been directed primarily towards the 
preparation of bibliographies in their individual fields. The subcommittees, 
along with their chairmen, are listed below. 







1. Chi-Square, 
W. Kruskal, University of Chicago. 
2. t-Distribution (Univariate and Multivariate), 
C. W. Dunnett, American Cyanamid Company, Pearl River, New York. 
3. Studentized Range, 
A. H. Bowker, Stanford University. 
4. F-Distribution (Incomplete-Beta, Binomial), 
E. E. Cureton, University of Tennessee. 
5. Hypergeometric Distribution (Not the hypergeometric function), 
W. Kruskal, University of Chicago. 
6. Polyvariate Normal Distribution, including latent roots, 
G. P. Steck, Sandia Corporation, Albuquerque, New Mexico. 
7. Availability of Simple Techniques, 
Chairman not appointed. 
8. Annals Supplement (Of statistical tables), 
J. W. Tukey, Bell Telephone Laboratories, Murray Hill, New Jersey. 
9. Computing Facilities and Cost-Free Machine Time, 
F. C. Leone, Case Inst. of Tech. 
Additional information on any of the above activities may be obtained from 
the chairman of the Committee on Mathematical Tables, D. B. Owen, Sandia 
Corporation, Albuquerque, New Mexico, or from any of the subcommittee 
chairmen. Anyone having time on a digital computer which may be made 
available on a cost-free basis to persons desiring to compute tables of general 
interest is invited to contact the chairman of subcommittee 9. 


I. List of Members of the IMS Committee on Mathematical Tables 


Chairman 


Dr. D. B. Owen, Division 5125, Sandia Corporation, Albuquerque, New Mexico 


Secretary and Vice Chairman 


Dr. G. P. Steck, Division 5125, Sandia Corporation, Albuquerque, New Mexico 


Members 

Professor R. L. Anderson, Institute of Statistics, North Carolina State College, Raleigh, 
North Carolina 

Professor A. H. Bowker, Department of Statistics, Stanford University, Stanford, Cali-
fornia 

Professor E. E. Cureton, 1846 Prospect Pl., S.E., Knoxville 15, Tennessee 

Professor W. J. Dixon, University of California, Department of Preventive Medicine and 
Public Health, Medical Center, Los Angeles 24, California 

Mr. C. W. Dunnett, 19 Edsall Place, Nanuet, New York 

Dr. Churchill Eisenhart, Chief, Statistical Engineering Laboratory, National Bureau of 
Standards, Washington 25, D. C. 

Dr. J. A. Greenwood, 16 Garfield Street, Cambridge 38, Massachusetts 

Professor H. O. Hartley, Statistical Laboratory, Iowa State College, Ames, Iowa 

Professor William Kruskal, Committee on Statistics, Eckhart Hall, University of Chicago, 
Chicago 37, Illinois 
Professor Fred C. Leone, Director, Statistical Laboratory, Case Institute of Technology, 
10900 Euclid Avenue, Cleveland 24, Ohio 

Professor Dan Teichroew, Graduate School of Business, Stanford University, Stanford, 
California 

Dr. John W. Tukey, Bell Telephone Laboratories, Murray Hill, New Jersey 

Professor M. A. Woodbury, 401 W. 205th Street, New York 34, New York 

Dr. Marvin Zelen, Statistical Engineering Laboratory, National Bureau of Standards, 
Washington 25, D. C. 


Ex officio members 


Professor L. J. Savage, Eckhart Hall, University of Chicago, Chicago 37, Illinois 
Professor Jacob Wolfowitz, Dept. of Math, Cornell University, Ithaca, New York 


II. Subcommittees of the IMS Committee on Mathematical Tables 
1. Chi-square 
*William Kruskal 
*Dan Teichroew 


2. t-distributions (univariate and multivariate) 

*C. W. Dunnett, Chairman 

Dr. H. A. David, Department of Statistics, Virginia Polytechnic Institute, Blacks- 
burg, Virginia 

Dr. S. S. Gupta, Department of Mathematics, University of Alberta, Edmonton, 
Alberta, Canada 

*H. O. Hartley 

Professor E.S. Keeping, Math Department, University of Alberta, Edmonton, Alberta, 
Canada 

Professor C. F. Kossack, Math Department, Purdue University, West Lafayette, 
Indiana 

Dr. A. M. Mood, General Analysis Corporation, 11753 Wilshire Boulevard, West 
Los Angeles, California 

J.B. Rabin, Sen. Computer Analyst, Burroughs Corporation, 1505 Sycamore Avenue, 
Willow Grove, Pennsylvania 

Dr. M. Sobel, Bell Telephone Lab., 555 Union Boulevard, Allentown, Pa. 

*Dan Teichroew 


3. Studentized range 
*A.H. Bowker, Chairman, 
Cuthbert Daniel, 116 Pinehurst Avenue, New York 33, New York 
Prof. W. T. Federer, Cornell University, Ithaca, New York 
*H. O. Hartley 


Professor G. E. Noether, Math Department, Boston University, 725 Commonwealth 
Ave., Boston 15, Massachusetts 


4. F-distribution (incomplete-beta, binomial) 
*E. E. Cureton, Chairman, 
*R. L. Anderson 
P. C. Cox, 1904 Idaho Avenue, Las Cruces, New Mexico 
Professor David Durand, 50 Memorial Drive, Cambridge 39, Massachusetts 
*J. A. Greenwood 
*H. O. Hartley 
Gunnar Kulldorff, University of Lund, Lund, Sweden, Malmgatan 16, Malmo, Sweden 
*Dan Teichroew 


* Member of the parent committee. See List I for address 
**H. F. Trotter 
Prof. J. G. Wendel, Math Department, University of Michigan, Ann Arbor, Michigan 


5. Hypergeometric distribution (not the hypergeometric function) 

*William Kruskal, Chairman, 

Professor Leo Katz, Department of Statistics, Michigan State University, East Lan- 
sing, Michigan 

Dr. E. F. Kimball, N. Y. State Pub. Service Commission; 20 Mayfair Drive, Slinger- 
lands, New York 

Professor G. J. Lieberman, Department of Statistics, Stanford University, Stanford, 
California 

Roger H. Moore, Los Alamos Scientific Laboratory, 3448A Orange, Los Alamos, New 
Mexico 

J. M. Wiesen, 1308 Arizona NE, Albuquerque, New Mexico 


6. Polyvariate normal, including latent roots 
*G. P. Steck, Chairman 
Professor T. W. Anderson, Center for Advanced Study in the Behavioral Sciences, 
202 Junipero Serra Blvd., Stanford, California 
P. C. Cox, (See Subcommittee 4 for address) 
*C. W. Dunnett 
S. S. Gupta, (See Subcommittee 2 for address) 
Professor Ingram Olkin, Department of Statistics, Michigan State University, East 
Lansing, Michigan 
*D. B. Owen 
M. Sobel, (See Subcommittee 2 for address) 
*Max A. Woodbury 


7. Availability of simple techniques 
Chairman position open. 
Professor R. A. Bradley, Department of Statistics, Virginia Polytechnic Institute, 
Blacksburg, Virginia 
*E. E. Cureton 
*W. J. Dixon 
*Churchill Eisenhart 
Dr. T. A. Lamke, Bu. of Res., Iowa State Teachers College, Cedar Falls, Iowa 
Professor S. B. Littauer, Columbia University, New York 27, New York 
**H. R. Watkins 
8. Annals Supplement 
*J. W. Tukey, Chairman 
J. S. Barnes, John Wiley & Sons Inc., 440 Fourth Avenue, New York 16, N. Y. 
*A.H. Bowker 
*Churchill Eisenhart 
Dr. T. E. Harris, The RAND Corporation, 1700 Main Street, Santa Monica, Calif. 
*D. B. Owen 
*Dan Teichroew 
9. Cost-Free Machine Time 
*F. C. Leone, Chairman 
Professor J. W. Hamblen, Computing Center, Oklahoma State University, Stillwater, 
Oklahoma 


W. H. Horton, Materials Engineering Department, Westinghouse Electric Corp., 
East Pittsburgh, Pa. 


** Address unknown. 







G. F. Lunger, Program Planning Department, Remington Rand Univac, St. Paul 
16, Minnesota 


Dr. H. A. Meyer, Director Statistical Laboratory, University of Florida, Gainesville, 
Florida 
*Max A. Woodbury 
REPORT OF THE LOS ANGELES MEETING OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


The Western Region Meeting, seventy-fifth meeting of the Institute of Mathe- 
matical Statistics, was held at the Los Angeles Campus of the University of 
California on December 27-28, 1957. Two sessions were joint with the American 
Statistical Association and the Institute of Management Sciences. A Special 
Invited Address was given by Claude Shannon, "Some Asymptotic Estimates 
for Sums of Random Variables." The following 69 members of the Institute 
attended: 


I. J. Abrams, T. W. Anderson, H. L. Ang, Leo A. Aroian, C. B. Bell, D. L. Bentley, Allan 
Birnbaum, Colin R. Blyth, Charles Boll, Julien L. Borden, A. H. Bowker, J. V. Breakwell, 
Bernice Brown, Herman Chernoff, Edward P. Coleman, L. M. Court, Edwin L. Crow, W. J. 
Dixon, Olive Jean Dunn, H. P. Edmundson, Bob E. Ellison, T. S. Ferguson, Evelyn Fix, 
Martin Fox, A. V. Gafarian, Edward Gammon, Norman R. Garner, E. J. Gilbert, F. A. 
Graybill, Wm C. Guenther, D. Guthrie, Jr., T. E. Harris, P. G. Hoel, John F. Hofmann, 
John M. Howell, Arnljot Høyland, Patricia Inman, M. V. Johns, Jr., R. F. Link, Albert 
Madansky, Craig A. Magwire, F. Massey, M. R. Mickey, O. B. Moan, Roger A. Moore, 
Paul B. Moranda, James Pachares, Emanuel Parzen, M. P. Peisakoff, H. H. Peterson, Ron 
Pyke, Roy Radner, F. C. Reed, David Rothman, Marion M. Sandomire, Henry Scheffé, 
E. M. Scheuer, Franklin Sheehan, Bernard Sherman, M. M. Siddiqui, Paul N. Somerville, 
D. Stoller, Fred H. Tingey, Howard G. Tucker, John W. Tukey, H. W. von Guérard, John 
E. Walsh, Louis H. Wegner, Bryan Wilkinson. 


The program of the meeting was as follows: 


Friday, December 27, 1957 


8:45-12:00 a.m. Statistics in the Management Sciences 
(Joint Session with The Institute of Management Sciences) 


Chairman: M. R. Mickey, Jr., The RAND Corporation. 
1. The Portfolio Selection Problem, Harry Markowitz, The RAND Corporation. 




2. On the Stochastic Theory of Inventory, I. J. Abrams, The Ramo-Wooldridge 
Corporation. 

3. Inventory Control Problems of Shipboard Supplies, Mina H. Gourary, George 
Washington University. (Read by Bernice B. Brown, The RAND Cor- 
poration). 

4. Demand for and Allocation of Engineering Personnel, Rajendra Kashyap 
(introduced by H. W. von Guerard) and Hermann W. von Guerard, 
Lockheed Aircraft Corporation. 
1:45-2:15 p.m. Special Invited Paper 


Chairman: Thomas S. Ferguson, University of California, Los Angeles 
Some Asymptotic Estimates for Sums of Random Variables 
Claude Shannon, MIT, The Center for Advanced Study in the Behavioral 
Sciences, and Bell Telephone Laboratories. 


2:30-5:00 p.m. Industrial Applications of Statistics 


(Joint Session with the American Statistical Association ) 

Chairman: John F. Hofmann, Systems Laboratories Corporation 

1. Some Applications of Experimental Design in Industry, Alex M. Mood and 
Paul Somerville, General Analysis Corporation. 

2. The Fitting of a Polynomial Form to a Function of Several Variables by the 
Use of Orthogonal Latin Squares, N. M. Peterson, Convair, Fort Worth. 

3. Confidence Intervals for the Reliability of Multi-Stage Systems, William C. 
Hoffman, The RAND Corporation. 

4. A Model for Depicting Fatigue, Irvin Whiteman, General Analysis Corpora-
tion (introduced by A. M. Mood). 

5. Long Range Planning for Manufacturing, Glen Ghormley, Cannon Electric 
Company. 


Saturday, December 28, 1957 


8:45-10:45 a.m. Stochastic Processes Applied to Medicine and Public Health 


Chairman: Frank J. Massey, University of California, Los Angeles 

1. Replication Versus Increasing Observation Points in the Estimation of Re- 
gression for Growth Type Data, Paul G. Hoel, University of California, 
Los Angeles. 

2. The Identifiability Problem for Functions of Finite Markov Chains, Edgar 
John Gilbert, University of California, Berkeley. 


11:00-12:00 a.m. Invited Address 


Chairman: Roger A. Moore, The Ramo-Wooldridge Corporation 
1. Statistical Theory of Some Quantal Response Models, Allan Birnbaum, Colum- 
bia University. 


1:45-2:45 p.m. Invited Address 


Chairman: O. B. Moan, Lockheed Aircraft Corporation 
1. Experiments with Mixtures, Henry Scheffé, University of California, Berkeley. 


3:00-5:00 p.m. Contributed Papers 


Chairman: Richard F. Link, Oregon State College 
1. Non-parametric Multiple-decision Procedures for Selecting That One of K 
Populations which has the Highest Probability of Yielding the Largest Obser- 
vation. (Preliminary Report). Robert Bechhofer, Cornell University, and 
Milton Sobel, Bell Telephone Laboratories. (By Title) 

2. The Asymptotic Efficiency of Friedman’s Chi-square$_r$-test ($\chi_r^2$-test). Ph. van 
Elteren, Mathematical Centre, Amsterdam. (By title) 

3. Least-squares Estimation when Residuals are Correlated. M. M. Siddiqui, 
University of North Carolina. 

4. A Property of Additively Closed Families of Distributions. Edwin L. Crow, 
Boulder Laboratories, National Bureau of Standards. 

5. Determining Sample Size for a Specified Width Confidence Interval. Franklin 
A. Graybill, Oklahoma State University. 

6. Nonparametric Estimation of Sample Percentage Point Standard Deviation. 
John E. Walsh, Lockheed Aircraft Corporation. 

7. On the Structure of Distribution-free Statistics. C. B. Bell, Xavier University 
of Louisiana and Stanford University. 

8. Estimation of the Location of a Discontinuity in Density. J. V. Breakwell, 
Lockheed Missile Systems Division, Palo Alto, and H. Chernoff, Stanford 
University. 

9. On the Supremum of the Poisson Process. Ronald Pyke, Stanford University. 

EVELYN FIX 
Associate Secretary 
REPORT OF THE EDITOR OF THE ANNALS FOR 1957 


During the year ending August 1, 1957, more new manuscripts were received 
by the Annals, totaling more manuscript pages, than in any previous year. A 
consequent increased requirement for printing is anticipated for the coming 
year, and the Council has authorized a 1958 volume of 1300 pages. 

The 1957 volume, totaling 1098 pages, contained 105 papers and notes. The 
increased size authorized by the Council in the past two years made it possible 
to keep the backlog at less than half an issue during 1957. 

The Annals is indebted to its staff of Cooperating Members, who do much of 
the refereeing, and to the following people who have generously given refereeing 
assistance: T. W. Anderson, P. Armitage, E. W. Barankin, M. S. Bartlett, R. 
Blumenthal, R. Bechhofer, G. E. P. Box, L. Breiman, D. L. Burkholder, S. 
Chandrasekhar, W. G. Cochran, W. S. Connor, L. Cote, S. L. Crump, F. N. 
David, M. D. Donsker, R. Dorfman, A. Duncan, M. Dwass, B. Epstein, P. 
Erdős, T. S. Ferguson, L. J. Folks, E. J. Gilbert, I. J. Good, L. Goodman, F. 
Graybill, U. Grenander, S. S. Gupta, J. F. Hannan, M. H. Hansen, W. Hoeffding, 
P. G. Hoel, H. Hotelling, A. T. James, N. L. Johnson, E. S. Keeping, J. H. B. 
Kemperman, D. G. Kendall, M. G. Kendall, H. Kesten, E. L. Lehmann, J. 
Lieblein, R. Leipnik, M. Loève, E. Lukacs, J. McGregor, A. Madansky, W. G. 
Madow, M. R. Mickey, A. M. Mood, P. A. P. Moran, F. Mosteller, J. Moyal, 
R. W. Murphy, M. Newman, I. Olkin, E. Parzen, R. L. Plackett, J. Pratt, 
R. Pyke, C. R. Rao, D. Ray, G. E. H. Reuter, J. Riordan, M. Rosenblatt, 
H. L. Royden, H. Rubin, J. Sacks, I. R. Savage, H. Scheffé, E. L. Scott, J. F. 
Scott, R. Sitgreaves, C. Streibel, R. F. Tate, D. Teichroew, A. J. Thomasian, 
H. Trotter, J. W. Tukey, D. L. Wallace, J. E. Walsh, B. L. Welch, L. Wegner, 
R. A. Wijsman, D. M. G. Wishart, G. Zyskind. 
Many thanks are due Ann Greene, Dorothy Stewart, and Margaret Wray, 
for handling the taxing work of the editorial office. 
T. E. Harris 
Editor 
December 26, 1957 
SUMMER SESSIONS AT BERKELEY, CALIFORNIA 


The 1958 summer program in the Department of Statistics of the University 
of California, Berkeley, California, will consist of two sessions: June 16 to July 26 
and July 28 to September 6. The faculty of the summer sessions will include 
Professor U. S. Nair of Travancore University in India, Dr. F. N. David of 
University College in London, and Professors David Blackwell, Evelyn Fix, 
Joseph L. Hodges, Jr., and J. Neyman of the Department of Statistics of the 
University of California, Berkeley. The program will include two undergraduate 
courses in each session, and two research seminars, one in statistical problems of 
health and one in the statistical study of structural relations in the physical 
sciences. 
PUBLICATIONS RECEIVED 


Chow, G. C., Demand for Automobiles in the United States, North-Holland Publishing 
Company, Amsterdam, (1957) v-110 pp., $10.00. 

Economica, Revista de la Facultad de Ciencias Economicas, Publicacion Trimestral, Diag- 
onal 77-4 Y 5, La Plata, Buenos Aires, Argentina. 

Contributions to the Theory of Games, Volume III, Edited by M. Dresher, A. W. Tucker, 
P. Wolfe, Princeton, New Jersey, Princeton University Press, 1957 Annals of Mathe- 
matics Studies Number 39. 





BIOMETRIKA 


Volume 44 Contents Parts 3 and 4, December 1957 


Karl Pearson, 1857-1957. Centenary lecture by J. B. S. Haldane. Leslie, P. H. An analysis of the data 
for some experiments carried out by Gause with populations of the protozoa, Paramecium aurelia and Para- 
mecium caudatum. Cox, D. R. & Smith, W. L. On the distribution of Tribolium confusum in a container. 
Watson, G. S. The χ² goodness-of-fit test for normal distributions. Sathe, Y. S. & Kamat, A. R. Approxi- 
mations to the distributions of some measures of dispersion based on successive differences. Haight, F. 
Queueing with balking. Durbin, J. Testing for serial correlation in systems of simultaneous regression equa- 
tions. Curnow, R. N. Heterogeneous error variances in split-plot experiments. Harris, A. J. A maximum- 
minimum problem related to statistical distributions in two dimensions. Roy, S. N. & Gnanadesikan, R. 
Further contributions to multivariate confidence bounds. Stevens, W. L. Shorter intervals for the param- 
eter of the binomial and Poisson distributions. Jowett, G. H. Statistical analysis using local properties of 
smoothly heteromorphic stochastic series. Anscombe, F. J. Dependence of the fiducial argument on the 
sampling rule. Fieller, E. C., Hartley, H. O. & Pearson, E. S. Tests for rank correlation coefficients. I. 
Berkson, J. Tables for use in estimating the normal distribution function by normit analysis. Moore, 
P. G. The two-sample t-test based on range. Foster, F. G. Upper percentage points of the generalized beta- 
distribution. II. Doig, Alison. A bibliography on the theory of queues. 

Miscellanea—Contributions by M. S. Bartlett, B. E. Cooper, N. L. Johnson, D. S. Palmer, A. 
R. Thatcher, J. W. Tukey 


Corrigenda—D. R. Cox Reviews Other Books Received 


The subscription, payable in advance, is now 54/- (or $8.00), per volume (including postage). Cheques should 
be made payable to Biometrika, crossed “a/c Biometrika Trust” and sent to the Secretary, Biometrika Office,
Department of Statistics, University College, London, W.C.1. All foreign cheques must be drawn on a Bank 
having a London agency. 

Issued by THE BIOMETRIKA OFFICE, University College, London 


Colloquium Publications, Volume XXXI, Revised edition.


FUNCTIONAL ANALYSIS 
AND SEMI-GROUPS 
by EINAR HILLE and RALPH S. PHILLIPS 


This is a completely revised and largely rewritten edition of the
Colloquium volume of the same name by the first author. The framework
of the earlier book has been kept, but the subject matter has
been rearranged and much expanded. The algebraic tools are introduced
much earlier and put to important use throughout the treatise.
Functional analysis occupies almost one third of the book. The sections
devoted to the analytical theory of semi-groups have been
augmented by new material on perturbation theory, adjoint theory,
spectral theory, operational calculus, and stochastic theory.

805 pages 

25% discount to members of the Society 


Order from 


AMERICAN MATHEMATICAL SOCIETY 
190 Hope Street, Providence 6, R.I.





ECONOMETRICA 


Journal of the Econometric Society 
Contents of Vol. 26, No. 1 - January, 1958 


N. GEORGESCU-ROEGEN   Threshold in Choice and the Theory of Demand

F. HAHN   Gross Substitutes and the Dynamic Stability of General Equilibrium

K. J. ARROW   Utilities, Attitudes, Choices: A Review Note

A. S. MANNE   A Linear Programming Model of the U.S. Petroleum Refining Industry

J. TOBIN   Estimation of Relationships for Limited Dependent Variables

M. BECKMANN   Some Aspects of the Airline Reservations Problem

P. CARRE Tentative de détermination empirique de fonctions de production pour les pays industriels 

H. WAGNER   A Monte Carlo Study of Estimates of Simultaneous Linear Structural Equations

G. STUVEL The Impact of Changes in the Terms of Trade on Western Europe’s Balance of Payments 

M. FISHER   A Sector Model: The Poultry Industry of the U.S.A.

Economic Efficiency in Plant Operations with Special Reference to the Marketing of California Pears (French,
Sammet, and Bressler). Review by F. V. Waugh. 

Business Forecasting, 1956 (Abramson, Mack, and others). Review by Charles F. Roos. 

Einführung in die mathematische Statistik (L. Schmetterer). Review by William Feller.

International Comparisons of Real Wages (International Labour Office). Review by Kurt W. Rothschild.

Income and Wealth. Series V (S. Kuznets, ed.). Review by J. B. D. Derksen.

Trends in Employment in the Service Industries (G. J. Stigler). Review by Colin Clark.

Zinstheorie (F. A. Lutz). Review by Joseph Aschheim.

Foundations of Productivity Analysis (Bela Gold). Review by C. F. Carter. 

Rapport sur les comptes de la nation. Review by Walter Froehlich. 

Automata Studies (C. E. Shannon and J. McCarthy, eds.). Review by Harry H. Goode.

Economic Progress (Léon H. Dupriez, ed.). Review by Leif Johansen. 

On Economic Theory and Socialism, Collected Papers (Maurice Dobb). Review by Kenneth O. May.

Statistics: A New Approach (W. A. Wallis and H. V. Roberts). Review by Maurice Quenouille. 

... der wirtschaftlichen Nutzungsdauer von Anlagegütern (Hansrudolf von Briel). Review by Eric Schiff.

Structural Interdependences of the Economy: Proceedings of an International Conference on Input-Output
Analysis (T. Barna, ed.). Review by Robert Solow.

International Economic Papers No. 4: Translations Prepared for the International Economic Association
(Peacock, Turvey, Stolper, and Henderson, eds.). Review by Paul M. Sweezy.

Marketing Efficiency in Puerto Rico (Galbraith, Holton, et al.). Review by Lester G. Telser.

Méthodologie économique (Gilles-Gaston Granger). Review by Sten Thore.

Studies in the Economics of Transportation (Beckmann, McGuire, Winsten, and Koopmans). Review by R. M. Thrall.

Capital and Its Structure (L. M. Lachmann). Review by W. P. Yohe. 
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